Introduction


The first book in this course focused on wave mechanics, a way of thinking 
about quantum mechanics in terms of wave functions and wave equations. 

This book presents a more general and powerful approach to quantum theory, 
which encompasses wave mechanics, but extends it in new directions. This new 
approach is properly called quantum mechanics. Towards the end of the book, we
will also raise some issues related to the meaning and interpretation of quantum 
theory, emphasizing the evidence provided by modern experiments. 


Quantum mechanics is hugely successful in its predictions. Some of this success 
is apparent from the wave mechanics you studied in Book 1; you will meet further 
successes in this book, and even more in Book 3, which uses quantum mechanics 
to explain the properties of atoms, molecules and solids. The successes are 
precise and quantitative. Equally important, quantum-mechanical concepts such 
as quantization and tunnelling have given us fresh ways of interpreting the
world, leading to new scientific explanations and major advances in technology,
including transistors and lasers. 


But the undoubted success of quantum mechanics does not lessen the shock of its 
claims, including the idea that fundamental laws of physics are probabilistic. 
Even the scientists who created quantum theory struggled to come to terms with it,
and Einstein believed that the probabilities of quantum mechanics only betray
our ignorance of the real states of systems. In 1935, Einstein, Podolsky and 
Rosen proposed a thought experiment which they claimed demonstrated that 
quantum mechanics was incomplete; otherwise, the outcomes of the experiment 
would be contrary to common sense. But, many years later, such experiments 
have been performed and quantum mechanics stands vindicated. Far from 
undermining quantum mechanics, Einstein’s thought experiment only shows us 
how extraordinary the world is. 


Erwin Schrödinger gave the name ‘entanglement’ to the key feature of quantum
mechanics exposed by Einstein’s argument. Entanglement is also the key to some 
truly remarkable applications that are just beginning to be developed. These 
modern applications are known collectively as quantum information. One of the 
most advanced areas of quantum information is quantum cryptography, which lies 
at the heart of protocols that allow encrypted information to be sent from one 
person to another with absolutely no fear of eavesdropping. Internet shopping 
companies and banks are, of course, very interested. Astonishingly, it is also 
possible to teleport the unmeasured state of a photon from one place to another 
— not quite the ‘beaming-up’ of people, but an impressive feat nonetheless. 

The greatest prize for physicists working in the field of quantum information 
would be to construct practical computers whose logical processes are based on 
quantum-mechanical principles, including the principle of superposition. Again, 
this subject is still in its infancy, but intensive effort is currently being devoted to 
making quantum computing a reality. 


These applications will be discussed towards the end of this book but, before we 
can describe them, we need to go deeper into the theory of quantum mechanics 
itself. This theory is, of course, the central theme of this course, and is the 
foundation upon which all applications rely. 




Chapter 1 presents a fresh way of thinking about quantum mechanics — one that 
shifts attention away from wave functions and towards basic notions about vectors 
and operators. This change in emphasis is accompanied by a notation invented by 
Dirac which will be a valuable ally throughout this book. Because it deals with 
notation and the rules for manipulating symbols, Chapter 1 may seem rather 
formal, but it will give you the chance to master techniques that will be useful 
later on. Chapter 1 also shows where Heisenberg’s uncertainty principle and 
Ehrenfest’s theorem come from, and how conservation laws arise in the world of 
atoms. 


Book 1 was largely devoted to one-dimensional systems. It will not have escaped 
your notice that most interesting things around us are not one-dimensional. A key 
concept needed in descriptions of real systems in more than one dimension is 
angular momentum. Chapter 2 shows how angular momentum is represented in 
quantum mechanics; you will see that the route to the operators that describe 
angular momentum is similar to the route that led to Schrödinger’s equation in
Book 1. You will also see that molecules have quantized energies associated with
their rotational motion, and you will see the extent to which atomic states can be
labelled by quantum numbers for angular momentum.


Nature has a huge surprise for us: there is also a purely quantum-mechanical 
form of angular momentum called spin, a property of electrons and many other 
fundamental particles of which matter is composed. The quantum formalism for 
spin is the subject of Chapter 3, and the concept of spin will be exploited in 
almost every chapter that follows. 


The phrase ‘a property of electrons’ suggests that all electrons have exactly
the same properties, and indeed they do. All electrons are absolutely identical,
as are all protons, all neutrons, and so on. Such identity of particles has no
parallel in the world of everyday objects. Yet, at the same time, it has a profound 
significance for what we see in the everyday world. One consequence of particle 
identity is the Pauli exclusion principle, which has a decisive influence on the 
physical and chemical properties of matter. This reminds us that real systems, like 
atoms or solids, consist of more than one particle. The way quantum mechanics 
handles systems of more than one particle, whether identical or not, is the subject
of Chapter 4. 


There follows a relatively short Chapter 5 that will give you the opportunity to 
pause and take stock. This chapter summarizes the formalism developed in
the first four chapters and clarifies some issues associated with the process of
measurement in quantum mechanics and with the description of quantities
that have a continuum of possible values. The chapter closes by updating the
preliminary principles of wave mechanics given in Chapter 2 of Book 1, 
presenting a revised list of the principles of quantum mechanics that encapsulates 
the key ideas on which the whole subject is based. 


The last two physics chapters, Chapters 6 and 7, are devoted to the subject of 
entanglement and its applications. Chapter 8 is a Mathematical toolkit, which 
covers the mathematics of vectors, vector spaces and matrices. This chapter is 
designed to support the physics chapters, especially Chapters 1 and 3, so do not 
delay your reading of it until the end. 


Chapter 1 A new language for quantum mechanics


Introduction 


‘By relieving the brain of all unnecessary work, a good notation sets it free 
to concentrate on more advanced problems ...’ 
A.N. Whitehead 


The wave mechanics in the first book of this course was presented in the language 
of calculus. Energy eigenfunctions were obtained by solving differential 
equations, and expectation values and uncertainties were found by evaluating 
integrals. This chapter will introduce an alternative notation for quantum 
mechanics — one that emphasizes different aspects of the subject. 


Notation is not a trivial issue. Many breakthroughs owe their existence to the 
invention of friendly notation. Roman accountants, for example, must have found 
it exhausting to multiply or divide two long numbers; nowadays, we have a 
much better way of writing numbers, with different columns for units, tens and 
hundreds, and a special symbol for zero, so basic arithmetic is far easier for us. 
The notation used for calculus is another case in point. Newton used dots to 
indicate differentiation while his rival, Leibniz, used the dx/dt notation. It has
been said that the refusal of English mathematicians to adopt Leibniz’s notation 
inhibited the development of mathematics in England for more than a century. 


This chapter presents a famous notation that was developed for quantum 
mechanics by the Nobel prize-winning physicist Paul Dirac. Behind Dirac’s 
notation, there is a striking insight — that the state of a system can be represented 
by a vector in an abstract space. This allows us to think about quantum mechanics 
in geometric terms. Dirac’s notation has other advantages as well. Later in this 
book, you will see that it can be used to describe the quantum property of spin, 
something that is beyond the scope of ordinary wave mechanics. 


A second theme of this chapter is based on the fact that measurements give real, 
rather than complex, values. When you measure the energy of an electron, for 
example, you might get 6.8 eV, but you will never get (6.8 − 3.7i) eV. This fact is
so obvious that you might not give it a second thought, but it turns out to have 
profound consequences. It implies that the operators used to represent observable 
quantities in quantum mechanics must be of a special kind; they are called 
Hermitian operators. When this fact is combined with Dirac notation, we obtain a 
powerful set of tools that can be used to prove many important results in quantum 
mechanics, including Ehrenfest’s theorem and the uncertainty principle. 


The chapter is organized as follows. Section 1.1 introduces the idea that quantum 
states can be represented by vectors in an abstract space, and Section 1.2 then 
goes on to develop Dirac’s notation. Section 1.3 introduces the concept of
a Hermitian operator, based on the fact that observable quantities have real
values. The methods developed in the first half of the chapter are then used in 
Sections 1.4 and 1.5 to establish two results which were stated without proof in 
the first book of this course: Ehrenfest’s theorem and the Heisenberg uncertainty 
principle. In both cases, we shall derive results more general than those discussed 
in Book 1 and, in the case of Ehrenfest’s theorem, we shall discuss the profound 
relationship between conservation laws and symmetries. 




References to the Mathematical toolkit mean the last chapter of this book,
unless stated otherwise.


Figure 1.1 (a) A silicon nitride diffraction grating fabricated so that its slits
have a spacing of 100 nm. (b) An interference pattern formed by a beam of helium
atoms passing through the grating in (a).
You may find the chapter rather abstract. This is because we are laying down
the ‘rules of the game’ of quantum mechanics and exploring their general 
consequences. Later chapters will build on the rules introduced here, and use 
them to describe the behaviour of specific physical systems. 


One of the themes in this chapter is the use of vectors in an abstract vector 
space. It is therefore advisable to refresh your memory of ordinary vectors in 
three-dimensional space by reading Section 8.1 of the Mathematical toolkit 
now. 


1.1 A geometric view of quantum mechanics


1.1.1 Quantum states as vectors 


Figure 1.1 shows an example of quantum-mechanical interference. A beam of 
helium atoms is sent through the slits shown in Figure 1.1a, and the interference 
pattern shown in Figure 1.1b is produced. At first sight, this is an example of 
atoms behaving as waves, but there is more to it than this; the interference pattern 
appears spot-by-spot as each atom is detected on the screen, so the experiment 
shows that atoms can exhibit the properties of both particles and waves. 
The underlying reason for quantum interference is the principle of superposition.
This tells us that, if $\Psi_1(x,t)$, $\Psi_2(x,t)$, $\Psi_3(x,t)$, ... are possible wave functions
of a system, then the linear combination

$$\Psi(x,t) = a_1 \Psi_1(x,t) + a_2 \Psi_2(x,t) + a_3 \Psi_3(x,t) + \cdots \tag{1.1}$$

is also a possible wave function of the system, provided that $\Psi(x,t)$ is
normalized. This can always be achieved by multiplying the right-hand side of
Equation 1.1 by a suitable constant.


In the case of the interference pattern in Figure 1.1b, $\Psi_i(x,t)$ corresponds to a
wave emerging from the $i$th slit in Figure 1.1a. Usually, no information is
available about which slit the particle passed through. In such a case, we must
suppose that the wave function is a linear combination of contributions associated
with passage through different slits. These contributions interfere with one
another and produce an interference pattern on the detecting screen. The twist is
that $\Psi_1(x,t)$, $\Psi_2(x,t)$, $\Psi_3(x,t)$, ... describe different ways in which a single
particle can propagate from the slits to the detecting screen. The propagating
particle interferes with itself.


The principle of superposition tells us that any normalized linear combination of 
wave functions yields another wave function. Conversely, we can also regard a 
given wave function as being a linear combination of parts. 


For example, when studying wave packets in Book 1, you saw that the harmonic
oscillator energy eigenfunctions form a complete set. This important property
means that any reasonable function can be expanded as a linear combination of
these eigenfunctions. So, if a harmonic oscillator is described by the wave
function $\Psi(x,0)$ at time $t = 0$, we can always write

$$\Psi(x,0) = \sum_{i=0}^{\infty} c_i\,\psi_i(x), \tag{1.2}$$

where the functions $\psi_i(x)$ are energy eigenfunctions of the harmonic oscillator,
the coefficients $c_i$ are complex constants, and the sum may contain infinitely many
terms. (This sum starts from zero because the lowest quantum number for a
harmonic oscillator is $n = 0$.)

The harmonic oscillator eigenfunctions obey the condition

$$\int_{-\infty}^{\infty} \psi_i^*(x)\,\psi_j(x)\,dx = \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{1.3}$$

We say that they are normalized and mutually orthogonal or, equivalently, that
they are orthonormal. Because the energy eigenfunctions are orthonormal, we can
find each unknown coefficient $c_j$ in the wave packet as follows. We multiply both
sides of Equation 1.2 by $\psi_j^*(x)$, and integrate over all $x$ to obtain

$$\int_{-\infty}^{\infty} \psi_j^*(x)\,\Psi(x,0)\,dx = \sum_{i=0}^{\infty} c_i \int_{-\infty}^{\infty} \psi_j^*(x)\,\psi_i(x)\,dx. \tag{1.4}$$

The orthonormality property (Equation 1.3) then gives

$$\int_{-\infty}^{\infty} \psi_j^*(x)\,\Psi(x,0)\,dx = \sum_{i=0}^{\infty} c_i\,\delta_{ji} = c_j, \tag{1.5}$$

so each coefficient can be found by evaluating the integral on the left-hand side of
Equation 1.5; this was called an overlap integral in Book 1. (The Kronecker delta
symbol kills off all terms in the sum except for that with $i = j$.)
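
As a numerical illustration of Equation 1.5, the following Python sketch builds a wave packet from two harmonic oscillator eigenfunctions with known coefficients and then recovers those coefficients from overlap integrals. It assumes dimensionless units in which $\psi_n(x) = (2^n n!\sqrt{\pi})^{-1/2} H_n(x)\,e^{-x^2/2}$. The function names, the integration range and the trapezoidal rule are illustrative choices rather than anything prescribed by the text.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x), built from the usual recurrence."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def psi(n, x):
    """Oscillator eigenfunction psi_n(x) in dimensionless units (an assumption)."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * hermite(n, x) * math.exp(-0.5 * x * x)

def overlap(f, g, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of the overlap integral of f*(x) g(x)."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * dx

def wave_packet(x):
    """Wave packet with known coefficients c_0 = 0.6 and c_1 = 0.8i (Equation 1.2)."""
    return 0.6 * psi(0, x) + 0.8j * psi(1, x)

# Equation 1.5: c_j is the overlap of psi_j with the wave packet.
recovered = [overlap(lambda x, j=j: psi(j, x), wave_packet) for j in range(3)]
for j, c in enumerate(recovered):
    print(f"c_{j} = {c:.6f}")
```

The recovered values should agree with the chosen coefficients to within the accuracy of the numerical integration, and the coefficient of $\psi_2$, which was not included in the packet, should come out as zero to the same accuracy.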
● Do Equations 1.1–1.5 remind you of anything else in mathematics?
○ If you have read Section 8.1 of the Mathematical toolkit, you might have
sensed that there is an analogy with the mathematics of vectors.


We shall now examine this analogy in detail. There are four points of comparison. 


1. Given any set of vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots$ in ordinary three-dimensional space,
and any set of real constants $a_1, a_2, a_3, \ldots$, we can form the linear combination

$$\mathbf{v} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + \cdots, \tag{1.6}$$

and this is also a vector in ordinary space. Equation 1.1 can be regarded as the
quantum-mechanical analogue of this result.
Figure 1.2 (a) Three basis vectors in ordinary three-dimensional space.
Because we are interested in generalizing to many dimensions, we have labelled
these vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ rather than $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$.
(b) The components $v_1$, $v_2$ and $v_3$ of a vector $\mathbf{v}$ are found by
projecting onto axes defined by the three basis vectors.
2. In ordinary three-dimensional space, we can choose a set of three normalized
and mutually orthogonal basis vectors, which we have labelled $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ in
Figure 1.2. Because this set of vectors is orthonormal, we can write

$$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}. \tag{1.7}$$

Equation 1.3 is the quantum-mechanical analogue of this property.




3. The three basis vectors form a complete set, and are said to provide a basis
for ordinary three-dimensional space; they are also said to span this space. This
means that any vector $\mathbf{v}$ can be written as a linear combination of the basis
vectors:

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3 = \sum_{i=1}^{3} v_i\,\mathbf{e}_i. \tag{1.8}$$

Equation 1.2 is the quantum-mechanical analogue of this expansion. The
coefficients $v_i$ are called the components of the vector $\mathbf{v}$, and their geometric
significance is shown in Figure 1.2b.


4. Because the basis vectors are orthonormal, we can find the $j$th component
of the vector $\mathbf{v}$ by taking its scalar product with the basis vector $\mathbf{e}_j$. Using
Equation 1.8, we obtain

$$\mathbf{e}_j \cdot \mathbf{v} = \sum_{i=1}^{3} v_i\,(\mathbf{e}_j \cdot \mathbf{e}_i), \tag{1.9}$$

and the orthonormality of the basis vectors then gives

$$\mathbf{e}_j \cdot \mathbf{v} = \sum_{i=1}^{3} v_i\,\delta_{ji} = v_j. \tag{1.10}$$

Equations 1.4 and 1.5 are the quantum-mechanical analogues of these last two
results.
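
Points 3 and 4 can be checked with a few lines of Python. The sketch below is only an illustration: the rotated basis and the vector `v` are arbitrary choices, and `dot` is a helper defined here. It finds the components of a vector by taking scalar products with an orthonormal basis (Equation 1.10) and then rebuilds the vector from those components (Equation 1.8).

```python
import math

def dot(a, b):
    """Scalar product of two ordinary three-dimensional vectors."""
    return sum(x * y for x, y in zip(a, b))

# An orthonormal basis: the usual x- and y-axes rotated by 45 degrees about z.
s = 1.0 / math.sqrt(2.0)
basis = [(s, s, 0.0), (-s, s, 0.0), (0.0, 0.0, 1.0)]

# Orthonormality check, Equation 1.7: e_i . e_j = delta_ij.
gram = [[dot(ei, ej) for ej in basis] for ei in basis]

# Components from scalar products, Equation 1.10: v_j = e_j . v.
v = (3.0, 1.0, 2.0)
components = [dot(e, v) for e in basis]

# Rebuild the vector from its components, Equation 1.8.
rebuilt = tuple(sum(c * e[k] for c, e in zip(components, basis)) for k in range(3))

print("components:", components)
print("rebuilt vector:", rebuilt)
```

The components depend on the basis chosen, but the rebuilt vector is the same vector we started with.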




To summarize, the analogy between wave functions and vectors is based on 
the following comparisons. 


Equations 1.1 and 1.6: A linear combination of wave functions is 
analogous to a linear combination of vectors. 


Equations 1.2 and 1.8: Any wave function can be written as a linear 
combination of eigenfunctions from a complete set, just as any vector can be 
written as a linear combination of basis vectors. The complete set of 
eigenfunctions is analogous to the complete set of basis vectors. 


Equations 1.3 and 1.7: A set of orthonormal eigenfunctions $\psi_i(x)$ is
analogous to a set of orthonormal basis vectors $\mathbf{e}_i$.


Equations 1.4 and 1.9: An overlap integral is analogous to a scalar 
product. 


Equations 1.5 and 1.10: A coefficient $c_j$ in a wave packet can be found
by evaluating an overlap integral, just as a component $v_j$ in a vector can be
found by evaluating a scalar product.


This chapter will build on this analogy. First, we shall think of functions as being
‘vectors’ in an abstract vector space. For example, we shall think of $e^{-x^2}\cos x$ as
pointing in one direction, and $e^{-x^2}\sin x$ as pointing in another direction, in a
space that represents all functions.

In fact, quantum mechanics restricts attention to complex-valued functions that
can be normalized, that is, functions $\psi(x)$ for which

$$\int_{-\infty}^{\infty} |\psi(x)|^2\,dx \quad \text{is finite.}$$

The vector space used to represent normalizable functions is called function 
space. Wave functions and eigenfunctions are normalizable functions, so we say 
that they correspond to vectors in function space. This is just a description — a 
choice of words — but it is one that emphasizes the close analogies between wave 
functions and vectors that were outlined above. 


1.1.2 Dirac notation 


Many physicists appreciated the analogy between wave functions and vectors, but 
it was Paul Dirac (Figure 1.3) who invented a notation that fully captures the spirit 
of this analogy. Although Dirac was one of the pioneers of quantum mechanics, 
making fundamental contributions from 1925 onwards, he only developed his 
notation in 1939. The notation rapidly caught on and is now a favorite choice of 
physicists. 


Ordinary vectors are usually printed in bold type (e.g. $\mathbf{r}$), but it would be
confusing to use the same convention for vectors in function space. Dirac
therefore devised a new and distinctive notation: to denote a particular vector in
function space, he used an angled bracket $|\ \rangle$, which he called a ket vector.
The contents of the angled bracket indicate the function under discussion. For
example, we can write $|f\rangle$ to denote the ket vector for the function $f(x)$.


The general mathematical concept of a vector space is discussed in Section 8.2
of the Mathematical toolkit. This is background material, which you may read at
any time.
Figure 1.3 Paul Dirac (1902-1984) was one of the pioneers of quantum mechanics.
He shared the 1933 Nobel prize for physics with Erwin Schrödinger.
However, one advantage of Dirac notation is that the contents of the angled
bracket can be anything that sensibly labels the function. For example, the
harmonic oscillator energy eigenfunction $\psi_n(x)$, with eigenvalue $E_n$ and quantum
number $n$, could be written as

$$|\psi_n\rangle, \quad |E_n\rangle \quad \text{or} \quad |n\rangle.$$

We can even put words or numbers inside the angled bracket, as in

$$|\text{ground state}\rangle, \quad |0\rangle \quad \text{or} \quad |n = 0\rangle.$$

Note, however, that function arguments are always omitted: we can write $|\psi_n\rangle$,
but we never write $|\psi_n(x)\rangle$.


In wave mechanics, the state of the system at a given time $t$ is described by the
wave function $\Psi(x,t)$. The ket vector corresponding to the wave function is
called the state vector, and can be denoted by a symbol such as $|\Psi\rangle$. As time
passes, the wave function changes, so the state vector continuously changes its
direction in function space. By contrast, the energy eigenfunctions $\psi_i(x)$ are
time-independent, and are represented by static vectors in function space. Usually,
we do not bother to show the time-dependence of the state vector in our notation,
but interpret the symbol $|\Psi\rangle$ as meaning the state vector at the time of interest; if
necessary we can always indicate the time concerned by something like $|\Psi_{\text{initial}}\rangle$.

We can now begin to write some equations using Dirac notation. An example is
given by Equation 1.1, which we now write in the form

$$|\Psi\rangle = a_1|\Psi_1\rangle + a_2|\Psi_2\rangle + a_3|\Psi_3\rangle + \cdots,$$

obtained simply by replacing functions by the corresponding ket vectors.


A crucial part of Dirac notation is the way it deals with overlap integrals. We have 
already seen that the overlap integral in Equation 1.5 is analogous to the scalar 
product in Equation 1.10. When we go beyond ordinary space, the term ‘scalar 
product’ is usually replaced by the more general term ‘inner product’. We are 
therefore led to think of an overlap integral as an inner product between vectors in 
function space. 


Dirac denoted the inner product of two ket vectors $|f\rangle$ and $|g\rangle$ by the symbol
$\langle f|g\rangle$, and he identified this with the overlap integral of the corresponding
functions $f(x)$ and $g(x)$. In other words, he defined

$$\text{inner product of } |f\rangle \text{ and } |g\rangle = \langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx. \tag{1.11}$$

The symbol $\langle f|g\rangle$ can be thought of as a shorthand for the overlap integral that
appears on the right-hand side of Equation 1.11. We shall call it the Dirac
bracket of the functions $f(x)$ and $g(x)$. It is important to note that the function
$f(x)$ in the left-hand slot of a Dirac bracket is complex-conjugated in the overlap
integral. In general, the two functions $f(x)$ and $g(x)$ have complex values, and
the Dirac bracket $\langle f|g\rangle$ is a complex number.
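
To see the definition at work, the sketch below evaluates a Dirac bracket numerically. The helper name `bracket`, the two example functions and the integration settings are assumptions made for illustration. For $f(x) = e^{ix}e^{-x^2/2}$ and $g(x) = x\,e^{-x^2/2}$ the integral can be done by hand and gives $\langle f|g\rangle = -\mathrm{i}\,(\sqrt{\pi}/2)\,e^{-1/4}$, a complex number, so the numerical estimate can be compared with a known value.

```python
import cmath
import math

def bracket(f, g, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of <f|g>: the integral of f*(x) g(x) over x."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        # The function in the left-hand slot is complex-conjugated.
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * dx

def f(x):
    return cmath.exp(1j * x) * math.exp(-0.5 * x * x)

def g(x):
    return x * math.exp(-0.5 * x * x)

fg = bracket(f, g)
exact = -1j * (math.sqrt(math.pi) / 2.0) * math.exp(-0.25)

# Forgetting the conjugation gives a different (here, opposite-sign) answer.
without_conjugate = bracket(lambda x: f(x).conjugate(), g)

print("numerical <f|g>:", fg)
print("exact     <f|g>:", exact)
print("integral of f g with no conjugation:", without_conjugate)
```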


Equation 1.11 applies to functions of a single variable. When describing a particle
in three dimensions, the wave function depends on three spatial coordinates.
Under these circumstances, the Dirac bracket of $|f\rangle$ and $|g\rangle$, corresponding to the
functions $f(x,y,z)$ and $g(x,y,z)$, is defined by

$$\langle f|g\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f^*(x,y,z)\,g(x,y,z)\,dx\,dy\,dz. \tag{1.12}$$

For simplicity, we will discuss the one-dimensional case here, but the extension to
three dimensions is straightforward. In any case, Dirac notation is unaffected by
the number of dimensions, and this is one of its advantages.


To illustrate Dirac notation in action, let’s rewrite Equations 1.2–1.5 using vectors
and Dirac brackets.

Equation 1.2 is written as

$$|\Psi\rangle = \sum_{i=0}^{\infty} c_i\,|\psi_i\rangle. \tag{1.13}$$

Equation 1.3 takes the form

$$\langle\psi_i|\psi_j\rangle = \delta_{ij}. \tag{1.14}$$

Equation 1.4 then becomes

$$\langle\psi_j|\Psi\rangle = \sum_{i=0}^{\infty} c_i\,\langle\psi_j|\psi_i\rangle. \tag{1.15}$$

Finally, Equation 1.5 is written as

$$\langle\psi_j|\Psi\rangle = \sum_{i=0}^{\infty} c_i\,\delta_{ji} = c_j. \tag{1.16}$$

Turning this last equation around, we see that the coefficient $c_j$ is given by

$$c_j = \langle\psi_j|\Psi\rangle = \int_{-\infty}^{\infty} \psi_j^*(x)\,\Psi(x,0)\,dx. \tag{1.17}$$

All we have done to convert Equations 1.2–1.5 into Equations 1.13–1.16 is to
replace functions by vectors and overlap integrals by Dirac brackets. You are
strongly advised to check how this works in each case.


Dirac notation is a sort of shorthand. Rather than writing down a cumbersome 
overlap integral, we simply write down the corresponding Dirac bracket. If we 
need to evaluate an overlap integral, we usually have to write it out in full so that 
we can use the techniques of calculus. However, there are many occasions when 
we do not need to do this, and it is here that Dirac notation is invaluable — 
offering us reductions in time, effort and clutter on the page. 


Yet Dirac notation is more than a shorthand. It also emphasizes the close analogy
between ket vectors and ordinary vectors, and you have seen that the Dirac bracket
$\langle f|g\rangle$ can be thought of as an inner product, analogous to the scalar product $\mathbf{a}\cdot\mathbf{b}$
between ordinary vectors. The next subsection will discuss the extent to which
ordinary geometric language and pictures can be used in function space.


Exercise 1.1 Use Dirac notation to write down: (a) the normalization condition
for a wave function $\Psi$; (b) the probability that an energy measurement on a
system in a state $\Psi$ will yield the discrete energy eigenvalue $E_i$ corresponding to
the energy eigenfunction $\psi_i(x)$. ■
1.1.3 Picturing vectors in function space


At first sight, there are some major differences between ordinary
three-dimensional space and function space. For example, the scalar product $\mathbf{a}\cdot\mathbf{b}$
in ordinary space is a real quantity, but the inner product in function space is given
by the Dirac bracket

$$\langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx,$$

which involves complex functions, and is therefore complex in general.


Three-dimensional space has a set of three basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$, which
means that any vector $\mathbf{a}$ can be expressed as a sum

$$\mathbf{a} = \sum_{i=1}^{3} a_i\,\mathbf{e}_i.$$

Although a vector $|\Psi\rangle$ in function space can also be expressed as a sum

$$|\Psi\rangle = \sum_{i=0}^{\infty} c_i\,|\psi_i\rangle,$$

in this case the sum may involve an infinite number of terms; in this sense,
function space has an infinite number of dimensions.
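
In practice an infinite sum is often approximated by its first few terms. The sketch below expands a normalized Gaussian centred on $x = 1$ in harmonic oscillator eigenfunctions and tracks the running total of $|c_i|^2$, which approaches 1 as more basis vectors are included. The example state, the dimensionless units and the cut-off at eight terms are all assumptions chosen for illustration.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x), built from the usual recurrence."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def psi(n, x):
    """Oscillator eigenfunction psi_n(x) in dimensionless units (an assumption)."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * hermite(n, x) * math.exp(-0.5 * x * x)

def overlap(f, g, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of the overlap integral of f*(x) g(x)."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * dx

def displaced(x):
    """Normalized Gaussian centred on x = 1 (an arbitrary example state)."""
    return math.pi ** -0.25 * math.exp(-0.5 * (x - 1.0) ** 2)

# Components along the first eight basis vectors, then running totals of |c_i|^2.
coeffs = [overlap(lambda x, n=n: psi(n, x), displaced) for n in range(8)]
partial_sums = []
running = 0.0
for c in coeffs:
    running += abs(c) ** 2
    partial_sums.append(running)

for n, total in enumerate(partial_sums):
    print(f"terms up to i = {n}: sum of |c_i|^2 = {total:.8f}")
```

For this particular state a handful of basis vectors already accounts for almost all of the norm; a state less similar to the low-lying eigenfunctions would need more terms.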


In spite of these differences, a close analogy remains. The analogy is strengthened
by the fact that we can extend the concept of the magnitude of a vector into
function space. In ordinary three-dimensional space, the magnitude of the vector
$\mathbf{a}$ is given by

$$a = |\mathbf{a}| = \sqrt{\mathbf{a}\cdot\mathbf{a}},$$

where the positive square root is taken. There is no difficulty in doing this because
$\mathbf{a}\cdot\mathbf{a} = a_1^2 + a_2^2 + a_3^2$ is real and non-negative. The magnitude of $\mathbf{a}$ is therefore
real and non-negative too. In function space, the inner product $\langle f|g\rangle$ is complex,
but putting $g(x) = f(x)$ gives

$$\langle f|f\rangle = \int_{-\infty}^{\infty} f^*(x)\,f(x)\,dx = \int_{-\infty}^{\infty} |f(x)|^2\,dx.$$

The integrand is real and non-negative everywhere, so $\langle f|f\rangle$ is also real and
non-negative. We can therefore take the positive square root to obtain a real,
non-negative quantity $\sqrt{\langle f|f\rangle}$ which is interpreted as the magnitude of $|f\rangle$. In
practice, we generally use the word norm instead of magnitude when dealing
with vectors in function space, and therefore say that

$$\text{norm of } |f\rangle = \sqrt{\langle f|f\rangle} \geq 0. \tag{1.18}$$

A vector with zero norm is called the zero vector, while a vector with unit norm
is said to be normalized.


Ordinary vectors in three-dimensional space also obey the inequality

$$(\mathbf{a}\cdot\mathbf{a})(\mathbf{b}\cdot\mathbf{b}) \geq (\mathbf{a}\cdot\mathbf{b})^2,$$

which follows from the relation

$$a^2 b^2 \geq (ab\cos\theta)^2 = a^2 b^2 \cos^2\theta = (\mathbf{a}\cdot\mathbf{b})^2.$$

(Remember that $\mathbf{a}\cdot\mathbf{b} = ab\cos\theta$, where $\theta$ is the angle between the
directions of $\mathbf{a}$ and $\mathbf{b}$.)

It is interesting to note that vectors in function space satisfy a similar inequality:

$$\langle f|f\rangle\,\langle g|g\rangle \geq |\langle f|g\rangle|^2. \tag{1.19}$$

This is known as the Cauchy–Schwarz inequality; later in the chapter, you will
see that it is a key ingredient in proving the uncertainty principle.
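
The Cauchy–Schwarz inequality can be checked numerically for particular functions. In the sketch below the helper `bracket` and the example functions are illustrative assumptions: $f(x) = (1 + \mathrm{i}x)\,e^{-x^2/2}$ and $g(x) = e^{-x^2/2}$, for which the brackets can also be evaluated by hand ($\langle f|f\rangle = 3\sqrt{\pi}/2$, $\langle g|g\rangle = \sqrt{\pi}$, $\langle f|g\rangle = \sqrt{\pi}$). The final lines look at the case where one vector is a multiple of the other, where the two sides become equal.

```python
import math

def bracket(f, g, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of <f|g>: the integral of f*(x) g(x) over x."""
    dx = (hi - lo) / steps
    total = 0j
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * complex(f(x)).conjugate() * g(x)
    return total * dx

def f(x):
    return (1.0 + 1j * x) * math.exp(-0.5 * x * x)

def g(x):
    return math.exp(-0.5 * x * x)

# Left- and right-hand sides of Equation 1.19.
lhs = bracket(f, f).real * bracket(g, g).real
rhs = abs(bracket(f, g)) ** 2
print(f"<f|f><g|g> = {lhs:.6f}, |<f|g>|^2 = {rhs:.6f}")

# Norms, Equation 1.18.
norm_f = math.sqrt(bracket(f, f).real)
norm_g = math.sqrt(bracket(g, g).real)
print(f"norm of f = {norm_f:.6f}, norm of g = {norm_g:.6f}")

# Equality case: h is a complex multiple of f.
def h(x):
    return 2j * f(x)

lhs_parallel = bracket(f, f).real * bracket(h, h).real
rhs_parallel = abs(bracket(f, h)) ** 2
print(f"parallel case: {lhs_parallel:.6f} vs {rhs_parallel:.6f}")
```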


These analogies suggest that it is reasonable to use basic geometric notions
in function space. In particular, we shall say that $|f\rangle$ is orthogonal to $|g\rangle$ if
$\langle f|g\rangle = 0$. Here, the phrase ‘orthogonal to’ is used in analogy to its ordinary
geometric sense: ‘at right angles to’. Of course, we anticipated this terminology
much earlier in the course when we described two functions $f(x)$ and $g(x)$ with a
vanishing overlap integral $\langle f|g\rangle$ as being orthogonal to one another.


We have seen that a harmonic oscillator has an infinite set of energy
eigenfunctions $\psi_0(x), \psi_1(x), \psi_2(x), \ldots$, with a corresponding set of ket vectors
$|\psi_0\rangle, |\psi_1\rangle, |\psi_2\rangle, \ldots$. These vectors are normalized and orthogonal, and are therefore
said to be orthonormal. We also know that the harmonic oscillator energy
eigenfunctions are a complete set, so that any vector $|\Psi\rangle$ in function space can be
expressed as

$$|\Psi\rangle = \sum_{i=0}^{\infty} c_i\,|\psi_i\rangle, \tag{1.20}$$

where the $c_i$ are complex scalars. We shall describe this fact using the same
language as for ordinary vectors. The vectors $|\psi_0\rangle, |\psi_1\rangle, |\psi_2\rangle, \ldots$ will be called basis
vectors. We shall say that these basis vectors form a complete set or a basis in
function space, or equivalently, that they span function space. All of these
statements express the fact that any vector in function space can be expressed as a
linear combination of the basis vectors $|\psi_i\rangle$. However, it is worth noting that the
harmonic oscillator energy eigenfunctions are not unique in this respect. They
provide one example of a basis in function space, but many other sets of functions
provide alternative bases.


By analogy with ordinary vectors, the coefficient $c_i$ in
Equation 1.20 can be called the (scalar) component of the vector
$|\Psi\rangle$ in the direction of the basis vector $|\psi_i\rangle$. Because the sum
in Equation 1.20 involves an infinite number of orthogonal basis
vectors, function space has an infinite number of dimensions, and
vectors within it have an infinite number of complex components.
It is not possible to visualize this situation in any realistic way.
Even so, it is helpful to draw some ‘cartoons’, which should not
be taken too literally but still capture the essence of the situation.

Figure 1.4 is a ‘cartoon’ representing the expansion of a state
vector $|\Psi\rangle$ in terms of the basis vectors $|\psi_i\rangle$. Compromises
have been made in order to draw this sketch: we show only two
of the basis vectors, $|\psi_1\rangle$ and $|\psi_2\rangle$, and the components $c_1$ and
$c_2$ are represented by real (rather than complex) numbers. In spite
of these deficiencies, the figure illustrates some important points:


Figure 1.4 A sketch indicating the relationship between a state vector $|\Psi\rangle$ and a
basis of energy eigenvectors $|\psi_1\rangle, |\psi_2\rangle, \ldots$. (Labels in the figure mark $c_1$ and $c_2$
as the probability amplitudes for $E_1$ and $E_2$.)
1. The basis vectors $|\psi_1\rangle$ and $|\psi_2\rangle$ both have the same (unit) length and are
drawn perpendicular to one another. This is because they are normalized and
mutually orthogonal. The state vector $|\Psi\rangle$ describes the state of a system at a
given time. It has the same length as the basis vectors because it is
normalized too.


2. The components $c_1$ and $c_2$ are found by a process of projection, dropping
perpendiculars onto the directions of the basis vectors. This is similar to the
picture for ordinary vectors given in Figure 1.2b.


3. The components $c_1$ and $c_2$ are also the coefficients of the energy
eigenfunctions $\psi_1(x)$ and $\psi_2(x)$ in the wave function. They are therefore
interpreted as the probability amplitudes for getting the energy eigenvalues
$E_1$ and $E_2$ in an energy measurement. The corresponding probabilities are
$|c_1|^2$ and $|c_2|^2$. In the situation shown in Figure 1.4, both $E_1$ and $E_2$ are
possible values, but $E_1$ is more likely than $E_2$ because $|c_1|^2 > |c_2|^2$.
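
As a worked illustration of point 3, consider an assumed example (it is not the state drawn in Figure 1.4) in which a normalized state vector has only two non-zero components, one of them complex. The probabilities follow from the squared moduli of the components and add up to 1, as they must for a normalized state.

```latex
\begin{align*}
|\Psi\rangle &= \frac{\sqrt{3}}{2}\,|\psi_1\rangle + \frac{\mathrm{i}}{2}\,|\psi_2\rangle,
  \qquad c_1 = \frac{\sqrt{3}}{2}, \quad c_2 = \frac{\mathrm{i}}{2}, \\
p_1 &= |c_1|^2 = \frac{3}{4}, \qquad
p_2 = |c_2|^2 = \left|\frac{\mathrm{i}}{2}\right|^2 = \frac{1}{4}, \\
p_1 + p_2 &= 1 .
\end{align*}
```

Here $E_1$ is three times as likely as $E_2$, even though $c_2$ is purely imaginary; only the modulus of each component affects the probability.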


Figure 1.5 A sketch showing a state vector $|\Psi\rangle$ and two sets of basis vectors,
$(|\psi_1\rangle, |\psi_2\rangle)$ and $(|\phi_1\rangle, |\phi_2\rangle)$.


Another important point is illustrated in Figure 1.5. Many different sets of basis 
vectors span function space, and the components of a given state vector depend on 
the choice of basis. What is the physical significance of this geometric fact? 

In quantum mechanics, different bases correspond to different measurable 
quantities. By projecting the state vector onto the red set of basis vectors, we get 
the probability amplitudes for one set of quantities. By projecting the state 
vector onto the blue set of basis vectors, we get the probability amplitudes for a 
different set of quantities. This helps us visualize the fact that the state vector (or 
equivalently the wave function) is a complete description of the state of the 
system; by projecting the state vector onto an appropriate basis, we can find the 
probability of any given experimental outcome. 
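As a rough illustration of this basis dependence, the sketch below projects one state vector onto two different orthonormal bases. The state, the 30 degree rotation and the helper name `probabilities` are all illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

# Hypothetical normalized state vector with one complex component.
state = np.array([0.8, 0.6j])

# Basis 1: the standard basis. Basis 2: the same basis rotated by 30 degrees.
theta = np.pi / 6
basis_1 = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
basis_2 = [np.array([np.cos(theta), np.sin(theta)]),
           np.array([-np.sin(theta), np.cos(theta)])]

def probabilities(basis, psi):
    """Return |<b_i|psi>|^2 for each basis vector b_i."""
    return np.array([abs(np.vdot(b, psi)) ** 2 for b in basis])

probs_1 = probabilities(basis_1, state)
probs_2 = probabilities(basis_2, state)

print(probs_1, probs_2)
```

Both sets of probabilities add up to 1, but the individual values differ, because each basis corresponds to a different set of possible outcomes.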


It is worth saying once more that the sketches in this subsection have their 
limitations; we cannot hope to give faithful pictures of vectors with many complex 
components. Nevertheless, the simple sketches given here illustrate the sort of 
image most physicists carry in their heads when they talk of projecting a state 
vector onto a basis. Dirac was known for being terse and literal-minded, and his 
famous book on quantum mechanics contained no diagrams; however, it is 
interesting to learn that his personal notebooks were full of them, and he declared 
a personal preference for ‘relationships which I can visualize in geometric terms’.


1.2 Using Dirac notation


1.2.1 Manipulating Dirac brackets 


We would like to carry out calculations using vectors and Dirac brackets without 
having to justify each step by referring back to functions and overlap integrals. To 
make this possible, you need to know some rules that apply to all Dirac brackets, 
so that you can manipulate them routinely. This section will state and derive the 
few rules that are required. 


1. Complex conjugation


An overlap integral is a complex number, so we can take its complex conjugate:

$$\left[\int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx\right]^* = \int_{-\infty}^{\infty} f(x)\,g^*(x)\,dx = \int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx.$$

(Remember that $(f^*)^* = f$.)


In terms of Dirac brackets, this result can be expressed as

$$\langle f|g\rangle^* = \langle g|f\rangle. \tag{1.21}$$


We must therefore be careful about the order of terms in a Dirac bracket: $\langle f|g\rangle$ is
generally different from $\langle g|f\rangle$.


Exercise 1.2 Use Equation 1.21 to show that $\langle f|f\rangle$, $\langle f|g\rangle\langle g|f\rangle$ and
$\langle f|g\rangle + \langle g|f\rangle$ are all real quantities. (Remember that an expression is
real if it is equal to its own complex conjugate.)


2. Taking constants outside Dirac brackets


Any multiplicative constant $c$ can be taken outside an integral, so we have

$$\int_{-\infty}^{\infty} f^*(x)\,[c\,g(x)]\,dx = c\int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx$$

and

$$\int_{-\infty}^{\infty} [c\,g(x)]^*\,f(x)\,dx = c^*\int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx.$$


In terms of Dirac brackets, these results can be expressed as

$$\langle f|cg\rangle = c\,\langle f|g\rangle, \tag{1.22}$$

$$\langle cg|f\rangle = c^*\,\langle g|f\rangle. \tag{1.23}$$


Notice the star in Equation 1.23. A constant in the right-hand slot of a Dirac 
bracket can be extracted from the bracket without change, but a constant in the 
left-hand slot must be complex-conjugated when it is extracted. This rule will be 
used throughout this chapter; do not get caught out by forgetting it! 
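Equations 1.21 to 1.23 can be checked numerically by approximating the overlap integral with a sum on a grid. This is only a sketch: the grid, the test functions `f` and `g`, the constant `c` and the helper `bracket` are illustrative choices, not part of the text.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

def bracket(f, g):
    """Approximate <f|g> = integral of f*(x) g(x) dx by a Riemann sum."""
    return np.sum(np.conj(f) * g) * dx

# Hypothetical normalizable test functions and a complex constant.
f = np.exp(-x**2) * np.exp(1j * x)
g = (x + 0.5) * np.exp(-x**2 / 2)
c = 2.0 - 3.0j

# Equation 1.21: <f|g>* = <g|f>
rule_121 = np.isclose(np.conj(bracket(f, g)), bracket(g, f))
# Equation 1.22: <f|cg> = c <f|g>
rule_122 = np.isclose(bracket(f, c * g), c * bracket(f, g))
# Equation 1.23: <cg|f> = c* <g|f>  (note the complex conjugate)
rule_123 = np.isclose(bracket(c * g, f), np.conj(c) * bracket(g, f))

print(rule_121, rule_122, rule_123)
```

Replacing `np.conj(c)` by `c` in the last check makes it fail, which is exactly the mistake the text warns against.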


Exercise 1.3 Simplify $\langle f|f\rangle$, where $f(x) = e^{i\alpha} g(x)$ and $\alpha$ is a real
constant.


3. Dirac brackets of linear combinations of functions


The integral of any sum of functions is a sum of integrals, so it immediately
follows that

$$\langle f|g + h\rangle = \langle f|g\rangle + \langle f|h\rangle \tag{1.24}$$

and

$$\langle g + h|f\rangle = \langle g|f\rangle + \langle h|f\rangle. \tag{1.25}$$


Combining these results with Equations 1.22 and 1.23, we see that, if

$$g(x) = \sum_i c_i\, g_i(x),$$

then

$$\langle f|g\rangle = \sum_i c_i\,\langle f|g_i\rangle, \tag{1.26}$$

$$\langle g|f\rangle = \sum_i c_i^*\,\langle g_i|f\rangle. \tag{1.27}$$


So the Dirac bracket of a linear combination of functions can be expanded as a 
linear combination of Dirac brackets; however, any constants extracted from the 
left-hand slots of the brackets must be complex-conjugated. 


Exercise 1.4 Simplify $\langle f|f + ig\rangle - \langle g - if|g\rangle$, given that
$\langle f|f\rangle = \langle g|g\rangle$.


1.2.2 Bra and ket vectors 


It is sometimes convenient to think of (f|g) as being formed from two separate 
entities, (f| and |g), which are joined together. Dirac called (f| a bra vector and 
|g) a ket vector — simply so that he could say that a bra and a ket join up to give 
a bra-ket (a bracket)! 


We can obtain valid equations for bra and ket vectors by looking at our results for
Dirac brackets. For example, if we strip away $\langle f|$ from Equation 1.22 and strip
away $|f\rangle$ from Equation 1.23, we get

$$|cg\rangle = c\,|g\rangle \quad\text{and}\quad \langle cg| = c^*\,\langle g|. \tag{1.28}$$


More generally, if we strip away $\langle f|$ from Equation 1.26 and strip away $|f\rangle$ from
Equation 1.27, we get

$$|g\rangle = \sum_i c_i\,|g_i\rangle \quad\text{and}\quad \langle g| = \sum_i c_i^*\,\langle g_i|. \tag{1.29}$$


This gives us the following rule:


To convert a ket vector $|g\rangle = \sum_i c_i\,|g_i\rangle$ into the corresponding bra vector
$\langle g| = \sum_i c_i^*\,\langle g_i|$, we replace all the ket vectors by their corresponding bra
vectors, and all the coefficients by their complex conjugates.


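The following example will show how this rule is used.


Worked Example 1.1

Essential skill: Using bra and ket notation and dummy indices


Given two vectors

$$|f\rangle = \sum_{i=0}^{\infty} a_i\,|\psi_i\rangle \quad\text{and}\quad |g\rangle = \sum_{i=0}^{\infty} b_i\,|\psi_i\rangle,$$

where the vectors $|\psi_i\rangle$ are a complete orthonormal set of energy
eigenvectors, express the Dirac bracket $\langle f|g\rangle$ in terms of the coefficients
$a_i$ and $b_i$.


Solution


Given that $|f\rangle = \sum_{i=0}^{\infty} a_i\,|\psi_i\rangle$, we use Equation 1.29 to write

$$\langle f| = \sum_{i=0}^{\infty} a_i^*\,\langle\psi_i|,$$

which must be joined to $|g\rangle = \sum_{i=0}^{\infty} b_i\,|\psi_i\rangle$.


To avoid omitting ‘cross-product’ terms, we take the precaution of using
different indices in the two sums. Changing the dummy index in the sum for
$|g\rangle$ from $i$ to $j$, we obtain

$$\langle f|g\rangle = \left(\sum_{i=0}^{\infty} a_i^*\,\langle\psi_i|\right)\left(\sum_{j=0}^{\infty} b_j\,|\psi_j\rangle\right).$$


Regrouping terms, and noting that each $\langle\psi_i|$ on the left can join up with each
$|\psi_j\rangle$ on the right to give $\langle\psi_i|\psi_j\rangle$, we obtain

$$\langle f|g\rangle = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} a_i^*\,b_j\,\langle\psi_i|\psi_j\rangle.$$


The energy eigenfunctions are orthonormal, so $\langle\psi_i|\psi_j\rangle = \delta_{ij}$, giving

$$\langle f|g\rangle = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} a_i^*\,b_j\,\delta_{ij}.$$


Finally, the Kronecker delta symbol kills off all terms in the double sum
except those with $j = i$, so we are left with the single sum

$$\langle f|g\rangle = \sum_{i=0}^{\infty} a_i^*\,b_i. \tag{1.30}$$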


Equation 1.30 provides another analogy with ordinary vectors in
three-dimensional space. You will recall that the scalar product of two ordinary
vectors is given by

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3,$$

which is a sum of products of components. Equation 1.30 is the natural extension
of this formula to function space; it further supports our interpretation of Dirac
brackets as inner products in function space.
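Equation 1.30 can be tested numerically by comparing an overlap integral with the corresponding sum over components. The sketch below assumes, purely for illustration, the energy eigenfunctions of an infinite square well of width `L` (labelled n = 1, 2, 3 rather than from 0) and made-up coefficient lists `a` and `b`.

```python
import numpy as np

L = 1.0
x = np.linspace(0.0, L, 2001)
dx = x[1] - x[0]

def psi(n):
    """Infinite-well energy eigenfunction on 0 <= x <= L (n = 1, 2, ...)."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

# Hypothetical expansion coefficients of |f> and |g>.
a = np.array([0.6, 0.8j, 0.0])
b = np.array([0.5, 0.5, 0.5 - 0.5j])

# Build the functions f(x) and g(x) from their components.
f = sum(a_i * psi(i + 1) for i, a_i in enumerate(a))
g = sum(b_i * psi(i + 1) for i, b_i in enumerate(b))

# <f|g> evaluated as an integral over x ...
overlap_integral = np.sum(np.conj(f) * g) * dx
# ... and as the sum of a_i^* b_i (np.vdot conjugates its first argument).
component_sum = np.vdot(a, b)

print(overlap_integral, component_sum)
```

The two numbers agree to numerical precision, so the bracket can be computed from the components alone once an orthonormal basis has been chosen.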


It is also interesting to interpret Equation 1.30 when $|f\rangle = |g\rangle = |\Psi\rangle$, a state
vector. In this special case, it becomes

$$\langle\Psi|\Psi\rangle = \sum_{i=0}^{\infty} a_i^*\,a_i = \sum_{i=0}^{\infty} |a_i|^2. \tag{1.31}$$


The left-hand side can be recognized as the norm of $|\Psi\rangle$, which is equal to 1. On
the right-hand side, $|a_i|^2$ is the probability of measuring the $i$th energy eigenvalue.
The sum extends over all possible values, so the right-hand side is the probability
of measuring one or another of the allowed energies, which must also be equal
to 1.


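Exercise 1.5 Two vectors $|a\rangle$ and $|b\rangle$ in function space are not orthogonal to
one another. What value of the constant $\beta$ must be chosen to ensure that the vector
$|c\rangle = |a\rangle + \beta\,|b\rangle$ is orthogonal to $|a\rangle$?


Exercise 1.6 If $|u\rangle$ and $|v\rangle$ are orthonormal vectors, show that:

(a) $|a\rangle = |u\rangle + |v\rangle$ is orthogonal to $|b\rangle = |u\rangle - |v\rangle$;

(b) $|c\rangle = |u\rangle + i\,|v\rangle$ is orthogonal to $|d\rangle = i\,|u\rangle + |v\rangle$.


Exercise 1.7 Given two vectors $|a\rangle$ and $|b\rangle$, we can construct the vector

$$|c\rangle = \langle b|b\rangle\,|a\rangle - \langle b|a\rangle\,|b\rangle.$$

Show that

$$\langle c|c\rangle = \langle b|b\rangle\left(\langle a|a\rangle\langle b|b\rangle - |\langle a|b\rangle|^2\right),$$

and hence prove the Cauchy–Schwarz inequality (Equation 1.19).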


1.3 Real values and Hermitian operators 


In quantum mechanics, we make a distinction between entities like wave 
functions or state vectors that cannot be observed directly, and quantities like 
energy or momentum that can be measured by suitable equipment. Quantities that 
can be measured are called observables. 


Obviously, the results of measurements are described by real numbers, rather than
complex numbers. This fact usually goes unnoticed in classical physics, which
deals exclusively with real-valued variables, but it becomes more significant in
quantum mechanics, since Schrödinger’s equation includes $i = \sqrt{-1}$ and wave
functions are generally complex. In spite of this use of complex numbers, the
formalism of quantum mechanics must ensure that observable quantities have real
values. In quantum mechanics, each observable $A$ is represented by a linear
operator $\hat{A}$. The requirement that $A$ has only real values imposes an additional
constraint on this operator. We shall now see what this is.


1.3.1 Hermitian operators 


To clarify our notation, it is helpful to picture the action of an operator $\hat{A}$ in
quantum mechanics. Figure 1.6 is drawn in a similar spirit to the diagrams of
Section 1.1.3. It shows a vector $|f\rangle$ that represents a function $f(x)$. The effect
of the operator $\hat{A}$ is to change this vector into a new vector, $\hat{A}\,|f\rangle$,
representing a new function, $\hat{A}f(x)$. We can therefore write

$$\hat{A}\,|f\rangle = |\hat{A}f\rangle. \tag{1.32}$$


Both these expressions mean the same thing; the first treats the
operator as acting directly on the vector, while the second treats
the operator as acting on the function $f(x)$, which is then
represented by the vector $|\hat{A}f\rangle$.


Figure 1.6 A cartoon indicating the effect of an operator $\hat{A}$ on a vector $|f\rangle$
in function space.


We shall assume that the observable $A$ has only real values,
which implies that the expectation value of $A$ is always real
in any state. From Book 1, we know that the expectation value
of $A$ is given by the sandwich integral

$$\langle A\rangle = \int_{-\infty}^{\infty} \Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx,$$

where the wave function $\Psi(x,t)$ describes the state of the system at the time of
measurement. This expectation value can be expressed more compactly using
Dirac notation, either as

$$\langle A\rangle = \langle\Psi|\hat{A}\Psi\rangle \tag{1.33}$$

or, using Equation 1.32, as

$$\langle A\rangle = \langle\Psi|\hat{A}|\Psi\rangle. \tag{1.34}$$

The important point is that $\langle A\rangle$ must be real. In general, a complex number $z$ is
real if and only if $z = z^*$, so we must have $\langle A\rangle = \langle A\rangle^*$, and Equation 1.33 then
gives

$$\langle\Psi|\hat{A}\Psi\rangle = \langle\Psi|\hat{A}\Psi\rangle^*.$$

Hence, recalling that $\langle g|f\rangle^* = \langle f|g\rangle$, we conclude that

$$\langle\Psi|\hat{A}\Psi\rangle = \langle\hat{A}\Psi|\Psi\rangle \tag{1.35}$$

for any state $\Psi$.


Question: Write out Equation 1.35 in full, using an integral sign.

Answer: Explicitly,

$$\int_{-\infty}^{\infty} \Psi^*(x,t)\,\bigl(\hat{A}\Psi(x,t)\bigr)\,dx = \int_{-\infty}^{\infty} \bigl(\hat{A}\Psi(x,t)\bigr)^*\,\Psi(x,t)\,dx.$$


In other words, it does not matter whether the operator $\hat{A}$ acts on the left-hand
$\Psi(x,t)$ in the sandwich integral, and the result is then complex-conjugated, or
whether it acts on the right-hand $\Psi(x,t)$, which is not complex-conjugated.


Because it originates from an expectation value, Equation 1.35 involves $\Psi$ in both
slots of the Dirac bracket. However, it is generally assumed that all operators
representing observable quantities obey an even stronger condition. First, we shall
make a mathematical definition:


Hermitian operators


If an operator $\hat{A}$ satisfies the condition

$$\langle f|\hat{A}g\rangle = \langle\hat{A}f|g\rangle \tag{1.36}$$

for any normalizable functions $f(x)$ and $g(x)$, the operator is said to be
Hermitian.


(The restriction to normalizable functions is a natural one to make in quantum
mechanics, which deals with normalized wave functions and eigenfunctions.)


Then we make a sweeping physical assumption:


Operators that represent observables


In quantum mechanics, any observable quantity $A$ is represented by a linear
Hermitian operator $\hat{A}$.


Equation 1.36 is very important. From a physical point of view, it embodies
the fact that measured values are real: if we set $f = g = \Psi$, we recover
Equation 1.35, so Hermitian operators have real expectation values. From a
mathematical point of view, it completes the repertoire of operations that can be
carried out with Dirac brackets. You will see that many things follow from this,
including Ehrenfest’s theorem and the uncertainty principle.


Exercise 1.8 Write out Equation 1.36 in full, using integral signs.


Let us check that some familiar operators, used to represent observables, are
indeed Hermitian. First, consider the position operator $\hat{x}$, which tells us to
multiply functions by $x$. Because $x$ is real, we can write

$$\int_{-\infty}^{\infty} f^*(x)\,\bigl(x\,g(x)\bigr)\,dx = \int_{-\infty}^{\infty} \bigl(x\,f(x)\bigr)^*\,g(x)\,dx.$$


This means that

$$\langle f|\hat{x}g\rangle = \langle\hat{x}f|g\rangle,$$

so the position operator $\hat{x}$ is Hermitian. The same conclusion applies to any
real-valued function of $x$, so the potential energy operator $\hat{V}$, which tells us to
multiply functions by $V(x)$, is also Hermitian.


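Now consider the momentum operator $\hat{p}_x = -i\hbar\,\partial/\partial x$. This operator includes a
factor of $i$, so it will be interesting to see whether it satisfies the Hermitian
condition (which, remember, delivers real expectation values). In this case,

$$\langle\hat{p}_x f|g\rangle = \int_{-\infty}^{\infty} \left(-i\hbar\frac{\partial f}{\partial x}\right)^* g(x)\,dx = i\hbar\int_{-\infty}^{\infty} \frac{\partial f^*}{\partial x}\,g(x)\,dx,$$

$$\langle f|\hat{p}_x g\rangle = \int_{-\infty}^{\infty} f^*(x)\left(-i\hbar\frac{\partial g}{\partial x}\right)dx.$$

So

$$\langle\hat{p}_x f|g\rangle - \langle f|\hat{p}_x g\rangle = i\hbar\int_{-\infty}^{\infty}\left(\frac{\partial f^*}{\partial x}\,g(x) + f^*(x)\,\frac{\partial g}{\partial x}\right)dx$$

$$= i\hbar\int_{-\infty}^{\infty}\frac{\partial}{\partial x}\bigl(f^*(x)\,g(x)\bigr)\,dx$$

$$= i\hbar\,\Bigl[f^*(x)\,g(x)\Bigr]_{-\infty}^{\infty} = 0, \tag{1.37}$$

where the last step follows because normalizable functions must tend to
zero at $\pm\infty$. We therefore conclude that the momentum operator obeys
$\langle\hat{p}_x f|g\rangle = \langle f|\hat{p}_x g\rangle$, and so is Hermitian.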


Surprisingly, perhaps, it is the presence of the imaginary factor, $-i\hbar$, that allows
the momentum operator to be Hermitian. The derivative operator $\partial/\partial x$, with no
imaginary factor, is not Hermitian. (If you ever found the presence of $i$ in the
momentum operator surprising, here is a good reason for it.)
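A finite-dimensional analogue makes the role of the factor $-i\hbar$ easy to see. The sketch below replaces $\partial/\partial x$ by a central-difference matrix `D` on a grid, with functions taken to vanish beyond the grid; the grid size, the spacing and the choice of units with `hbar = 1` are illustrative assumptions.

```python
import numpy as np

n, dx, hbar = 200, 0.1, 1.0   # hypothetical grid size, spacing and units

# Central-difference matrix for d/dx. Functions are taken to vanish beyond
# the grid, mimicking normalizable functions that tend to zero at infinity.
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)

# Discretized momentum operator p_x = -i hbar d/dx.
P = -1j * hbar * D

def is_hermitian(M):
    """A matrix is Hermitian if it equals its own conjugate transpose."""
    return np.allclose(M, M.conj().T)

derivative_is_hermitian = is_hermitian(D)
momentum_is_hermitian = is_hermitian(P)

print(derivative_is_hermitian, momentum_is_hermitian)
```

The matrix `D` is real and antisymmetric, so its conjugate transpose is `-D`; multiplying by the imaginary factor turns this sign change into the equality required of a Hermitian operator.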
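Exercise 1.9 Show that

$$\left\langle f\,\Big|\,\frac{\partial g}{\partial x}\right\rangle = -\left\langle\frac{\partial f}{\partial x}\,\Big|\,g\right\rangle$$

for all normalizable functions $f(x)$ and $g(x)$. Hence show that the derivative
operator $\partial/\partial x$ is not Hermitian.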


1.3.2 Eigenvalues and measured values 


We have seen that observables, represented by linear Hermitian operators, have 
real expectation values. Now we shall examine a more detailed point: every 
measured value of an observable A must be real. 


As a general rule, the allowed values of an observable $A$ are found by solving an
eigenvalue equation for the operator $\hat{A}$. For example, the eigenvalue equation for
energy is the time-independent Schrödinger equation

$$\hat{H}\,\psi_i(x) = E_i\,\psi_i(x),$$

and the eigenvalues $E_i$ are the possible energies of the system.


More generally, we can write the eigenvalue equation for any observable $A$ in the
form

$$\hat{A}\,\phi_i(x) = a_i\,\phi_i(x),$$

or, in the language of ket vectors,

$$\hat{A}\,|\phi_i\rangle = a_i\,|\phi_i\rangle.$$

The functions $\phi_i(x)$ are called the eigenfunctions of $\hat{A}$, and the corresponding
vectors $|\phi_i\rangle$ are called eigenvectors. The numbers $a_i$ are the eigenvalues, and
these are interpreted as the possible values of $A$. Since $A$ is an observable, $\hat{A}$ is
Hermitian. Let us see what effect this has on the eigenvalues and eigenvectors.


We write down the Hermitian condition for $\hat{A}$, using a pair of its own
eigenfunctions, $\phi_i(x)$ and $\phi_j(x)$:

$$\langle\phi_j|\hat{A}\phi_i\rangle = \langle\hat{A}\phi_j|\phi_i\rangle.$$

Then, applying the eigenvalue equation on both sides,

$$\langle\phi_j|a_i\phi_i\rangle = \langle a_j\phi_j|\phi_i\rangle.$$

Pulling out the constants $a_i$ and $a_j$ from the Dirac brackets, we get

$$a_i\,\langle\phi_j|\phi_i\rangle = a_j^*\,\langle\phi_j|\phi_i\rangle,$$

which can be rearranged to give

$$(a_i - a_j^*)\,\langle\phi_j|\phi_i\rangle = 0. \tag{1.38}$$


One or other of the factors $(a_i - a_j^*)$ and $\langle\phi_j|\phi_i\rangle$ must be equal to zero.


If we take $j = i$, we know that $\langle\phi_i|\phi_i\rangle \neq 0$, so we conclude that $a_i = a_i^*$:


The eigenvalues of a Hermitian operator are real.


Using this fact, Equation 1.38 becomes

$$(a_i - a_j)\,\langle\phi_j|\phi_i\rangle = 0,$$

and it immediately follows that

$$\langle\phi_j|\phi_i\rangle = 0 \quad\text{for } a_i \neq a_j.$$


We therefore conclude that: 


Different eigenfunctions (or eigenvectors) of a Hermitian operator, 
corresponding to different eigenvalues, are orthogonal. 


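Figure 1.7 A cartoon indicating the effect of an operator $\hat{A}$ on two of its
eigenvectors, $|\phi_1\rangle$ and $|\phi_2\rangle$. The sketch is labelled with
$\hat{A}\,|\phi_1\rangle = a_1\,|\phi_1\rangle$ and $\hat{A}\,|\phi_2\rangle = a_2\,|\phi_2\rangle$.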


Both these results make good sense. The eigenvalues of a
Hermitian operator $\hat{A}$ are the possible values of the observable $A$,
and this whole section has been based on the premise that
these are real. Moreover, we know from Book 1 that $\langle\phi_i|\phi_j\rangle$
is the probability amplitude of getting the eigenvalue $a_i$
when the system is in a state $\phi_j$ in which we are certain to
get the eigenvalue $a_j$. If $a_i$ and $a_j$ are different, this probability
amplitude is clearly equal to zero.


Finally, we draw another cartoon, which summarizes our results.
Figure 1.7 shows two of the eigenvectors $|\phi_1\rangle$ and $|\phi_2\rangle$ of
a Hermitian operator $\hat{A}$. The operator $\hat{A}$ does not change the
directions of the eigenvectors, but just stretches (or contracts)
them by the real factors $a_1$ and $a_2$. This is quite different from
the effect of $\hat{A}$ on a general vector (Figure 1.6). The sketch
also shows that the eigenvectors are orthogonal.
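Both boxed results have direct analogues for Hermitian matrices, which can be checked numerically. The sketch below builds an arbitrary 4 by 4 Hermitian matrix from random numbers (an illustrative stand-in for an operator, not an example from the text) and deliberately uses NumPy's general eigen-solver, so that realness and orthogonality are observed rather than assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

# Hypothetical Hermitian matrix: A equals its own conjugate transpose.
A = (M + M.conj().T) / 2

# General eigen-solver: nothing about the result is assumed in advance.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Largest imaginary part among the eigenvalues (expected to be negligible).
max_imaginary_part = np.max(np.abs(eigenvalues.imag))

# Matrix of brackets <phi_j|phi_i>; its off-diagonal entries test orthogonality.
gram = eigenvectors.conj().T @ eigenvectors
off_diagonal = gram - np.diag(np.diag(gram))
max_overlap = np.max(np.abs(off_diagonal))

print(max_imaginary_part, max_overlap)
```

Both printed numbers are at the level of rounding error: the eigenvalues are real, and eigenvectors belonging to different eigenvalues are orthogonal.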


1.3.3 Combining Hermitian operators 


Operators can be combined in various ways. For example, the Hamiltonian
operator of a free particle is $\hat{H} = \hat{p}_x^2/2m$, which is the square of the momentum
operator, multiplied by the real constant $1/2m$. More generally, we are interested
in combinations such as $\hat{A} + \hat{B}$ or $\hat{A}\hat{B}$. The issue is: if $\hat{A}$ and $\hat{B}$ are both
Hermitian, are particular combinations of $\hat{A}$ and $\hat{B}$ also Hermitian?


We can always test whether a given operator $\hat{O}$ is Hermitian by checking whether

$$\langle f|\hat{O}g\rangle = \langle\hat{O}f|g\rangle$$

for all normalizable functions $f(x)$ and $g(x)$. For example, if $\hat{A}$ is Hermitian and
$\lambda$ is a real constant, we have

$$\langle f|\lambda\hat{A}g\rangle = \lambda\,\langle f|\hat{A}g\rangle \quad\text{(because } \lambda \text{ is a constant)}$$

$$= \lambda\,\langle\hat{A}f|g\rangle \quad\text{(because } \hat{A} \text{ is Hermitian)}$$

$$= \langle\lambda\hat{A}f|g\rangle \quad\text{(because } \lambda \text{ is real).}$$


We can therefore conclude that $\lambda\hat{A}$ is a Hermitian operator. Without going into
the details, we can state the following rules of thumb:


Rules for Hermitian operators


If $\hat{A}$ and $\hat{B}$ are both Hermitian and $\lambda$ is a real constant, it can be shown that:
• $\lambda\hat{A}$ is Hermitian;
• any power of $\hat{A}$ is Hermitian;
• $\hat{A} + \hat{B}$ is Hermitian.


Using these rules, and starting from the knowledge that $\hat{x}$ and $\hat{p}_x$ are Hermitian,
we can see that $\hat{H} = \hat{p}_x^2/2m + \tfrac{1}{2}C\hat{x}^2$ is Hermitian, provided that $m$ and $C$ are
real constants. This is just as well, of course, because $\hat{H}$ is the Hamiltonian
operator of a harmonic oscillator.

However, not all combinations of Hermitian operators are Hermitian. To illustrate
this point, we shall consider the product of two Hermitian operators, $\hat{A}$ and $\hat{B}$.
Operators act on functions placed to their right, so the meaning of the product
$\hat{A}\hat{B}f(x)$ is that we first let $\hat{B}$ act on $f(x)$, to give $\hat{B}f(x)$, and then let $\hat{A}$ act on the
result. Taking $\hat{A}$ and $\hat{B}$ to be Hermitian, we therefore have

$$\langle f|\hat{A}\hat{B}g\rangle = \langle f|\hat{A}(\hat{B}g)\rangle = \langle\hat{A}f|\hat{B}g\rangle = \langle\hat{B}\hat{A}f|g\rangle. \tag{1.39}$$


This equation shows that $\hat{A}\hat{B}$ would be Hermitian if we could take one extra step
and write $\langle\hat{B}\hat{A}f|g\rangle = \langle\hat{A}\hat{B}f|g\rangle$ for all normalizable functions $f(x)$ and $g(x)$. In
effect, this means that:


If $\hat{A}$ and $\hat{B}$ are Hermitian, the product $\hat{A}\hat{B}$ is Hermitian if and only if

$$\hat{A}\hat{B} - \hat{B}\hat{A} = 0. \tag{1.40}$$
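The same statement holds for Hermitian matrices, which gives a quick way to see it in action. In the sketch below, the 2 by 2 matrices `sigma_x`, `sigma_z` and `diagonal` are illustrative choices (not taken from the text): the first pair does not commute, while the second pair does.

```python
import numpy as np

def is_hermitian(M):
    """A matrix is Hermitian if it equals its own conjugate transpose."""
    return np.allclose(M, M.conj().T)

def commutator(A, B):
    """Return AB - BA."""
    return A @ B - B @ A

# Three hypothetical Hermitian matrices.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
diagonal = np.diag([2.0, 5.0]).astype(complex)

# sigma_x and sigma_z do not commute, so their product is not Hermitian.
non_commuting_product = is_hermitian(sigma_x @ sigma_z)

# sigma_z and the diagonal matrix commute, so their product is Hermitian.
commuting_product = is_hermitian(sigma_z @ diagonal)

print(non_commuting_product, commuting_product)
```

The first check reports that the product is not Hermitian and the second that it is, in line with the condition in Equation 1.40.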


The expression on the left-hand side of Equation 1.40 is called the commutator
of $\hat{A}$ and $\hat{B}$. When this vanishes, we say that the two operators commute with one
another. This means that their ordering does not matter: $\hat{A}\hat{B}$ has the same effect as
$\hat{B}\hat{A}$.


In practice, the ordering of operators often does matter. For example, if we let
$\hat{x}\hat{p}_x$ act on an arbitrary function $f(x)$, we get

$$\hat{x}\hat{p}_x f(x) = -i\hbar\,x\,\frac{\partial f}{\partial x}.$$


But if we let $\hat{p}_x\hat{x}$ act on the same function, we get something different:

$$\hat{p}_x\hat{x} f(x) = -i\hbar\,\frac{\partial}{\partial x}\bigl(x f(x)\bigr) = -i\hbar\left(x\,\frac{\partial f}{\partial x} + f(x)\right).$$


Subtracting these two equations and omitting the arbitrary function $f(x)$ on which
the operators act, we obtain the following commutation relation between the
operators $\hat{x}$ and $\hat{p}_x$:

$$\hat{x}\hat{p}_x - \hat{p}_x\hat{x} = i\hbar. \tag{1.41}$$

(This equation is of fundamental importance; you will meet it again later in this
chapter.)


This commutation relation shows that $\hat{x}$ and $\hat{p}_x$ do not commute with one
another: ordering is important in this case.
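Equation 1.41 can be checked numerically by letting both orderings act on a test function. The sketch below is one possible check, assuming units with `hbar = 1`, a Gaussian test function `f` and a finite-difference derivative from `np.gradient`; all of these are illustrative choices.

```python
import numpy as np

hbar = 1.0                                 # units with hbar = 1 (an assumption)
x = np.linspace(-8.0, 8.0, 3201)
f = np.exp(-(x - 1.0) ** 2)                # hypothetical smooth test function

def p_op(func_values):
    """Apply p_x = -i hbar d/dx using finite differences."""
    return -1j * hbar * np.gradient(func_values, x)

xp_f = x * p_op(f)          # p_x acts first, then multiplication by x
px_f = p_op(x * f)          # multiplication by x first, then p_x

# (x p_x - p_x x) f should approximate i hbar f.
difference = xp_f - px_f
max_error = np.max(np.abs(difference - 1j * hbar * f))

print(max_error)
```

The residual is small and shrinks as the grid is refined, which is consistent with the commutator acting on any smooth function as multiplication by $i\hbar$.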


This leaves us with a puzzle: in classical physics, $x p_x$ is a perfectly good
dynamical variable, so we should be able to ask what is the corresponding
operator in quantum mechanics. Neither $\hat{x}\hat{p}_x$ nor $\hat{p}_x\hat{x}$ will do, because they are
not Hermitian. The correct choice turns out to be $\tfrac{1}{2}(\hat{x}\hat{p}_x + \hat{p}_x\hat{x})$, which is
Hermitian, as you can show in Exercise 1.11 below. This example shows that it
is not always obvious how to make the transition from classical variables to
quantum-mechanical operators; ultimately, a choice must be made that is justified
by experimental results.


Exercise 1.10 Demonstrate that $\hat{A} + \hat{B}$ is Hermitian provided that $\hat{A}$ and $\hat{B}$
are both Hermitian.


Exercise 1.11 Show that $\hat{A}\hat{B} + \hat{B}\hat{A}$ is Hermitian whenever $\hat{A}$ and $\hat{B}$ are
Hermitian. Hence confirm that $\tfrac{1}{2}(\hat{x}\hat{p}_x + \hat{p}_x\hat{x})$ is Hermitian.


1.4 The generalized Ehrenfest theorem 


Book 1 introduced Ehrenfest’s theorem:

$$\frac{d\langle x\rangle}{dt} = \frac{\langle p_x\rangle}{m}, \tag{1.42}$$

$$\frac{d\langle p_x\rangle}{dt} = -\left\langle\frac{\partial V}{\partial x}\right\rangle, \tag{1.43}$$

and gave examples of its use. Ehrenfest’s theorem is of considerable interest
because it provides a link between quantum and classical mechanics. A simple
example is provided by a free particle, for which $V(x) = 0$. In this special case,
Equation 1.43 shows that $\langle p_x\rangle$ is a constant, and Equation 1.42 then shows that
$d\langle x\rangle/dt$ is a constant. These are the quantum-mechanical versions of the law of
conservation of momentum for a free particle and Newton’s first law.


This section will show where Ehrenfest’s theorem comes from. In fact, we will go
further and derive a generalized version of Ehrenfest’s theorem, a formula for the
rate of change of any expectation value. The main principle we shall use is
Schrödinger’s equation, which determines the rate of change of the wave function,
but a vital part of the argument hinges on the fact that the Hamiltonian operator is
Hermitian.


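The expectation value of any observable $A$ is given by

$$\langle A\rangle = \int_{-\infty}^{\infty} \Psi^*(x,t)\,\hat{A}\,\Psi(x,t)\,dx. \tag{1.44}$$

We shall assume that the operator $\hat{A}$ does not depend on time, so that there is no
$t$ in the expression for $\hat{A}$. This is true for operators such as $\hat{x}$ and $\hat{p}_x$, and it is
also true for the Hamiltonian operator $\hat{H}$ in an isolated system. Taking the
derivative of Equation 1.44 with respect to time, the rate of change of $\langle A\rangle$ is

$$\frac{d\langle A\rangle}{dt} = \int_{-\infty}^{\infty} \frac{\partial\Psi^*}{\partial t}\,\hat{A}\,\Psi(x,t)\,dx + \int_{-\infty}^{\infty} \Psi^*(x,t)\,\hat{A}\,\frac{\partial\Psi}{\partial t}\,dx.$$


The left-hand side of this equation is written with an ordinary derivative, $d/dt$,
while the rest of the equation uses partial derivatives, $\partial/\partial t$. This is appropriate
because $\langle A\rangle$ depends only on one variable, $t$, while the wave function in the
integrand depends on both $x$ and $t$. We shall now use Dirac brackets to write the
right-hand side in a more compact form:

$$\frac{d\langle A\rangle}{dt} = \left\langle\frac{\partial\Psi}{\partial t}\,\Big|\,\hat{A}\Psi\right\rangle + \left\langle\Psi\,\Big|\,\hat{A}\frac{\partial\Psi}{\partial t}\right\rangle. \tag{1.45}$$

From Schrödinger’s equation, the rate of change of the wave function is

$$\frac{\partial\Psi}{\partial t} = \frac{1}{i\hbar}\,\hat{H}\Psi,$$

so we have

$$\frac{d\langle A\rangle}{dt} = \left\langle\frac{1}{i\hbar}\hat{H}\Psi\,\Big|\,\hat{A}\Psi\right\rangle + \left\langle\Psi\,\Big|\,\hat{A}\,\frac{1}{i\hbar}\hat{H}\Psi\right\rangle.$$

In the second term on the right-hand side, we have placed the Hamiltonian
operator immediately next to $\Psi$ because this is what Schrödinger’s equation tells
us to do. The operator $\hat{A}$ then appears to the left of $\hat{H}$, and we must take care to
preserve this ordering because $\hat{A}$ does not necessarily commute with $\hat{H}$.


Taking constant factors outside the Dirac brackets, we have

$$\frac{d\langle A\rangle}{dt} = -\frac{1}{i\hbar}\,\langle\hat{H}\Psi|\hat{A}\Psi\rangle + \frac{1}{i\hbar}\,\langle\Psi|\hat{A}\hat{H}\Psi\rangle.$$

(Remember that constants extracted from the left-hand slot of a Dirac bracket must
be complex-conjugated.)

Finally, we use the key fact that $\hat{H}$ is Hermitian to obtain

$$\frac{d\langle A\rangle}{dt} = -\frac{1}{i\hbar}\,\langle\Psi|\hat{H}\hat{A}\Psi\rangle + \frac{1}{i\hbar}\,\langle\Psi|\hat{A}\hat{H}\Psi\rangle,$$

which can be rewritten as

$$\frac{d\langle A\rangle}{dt} = \frac{1}{i\hbar}\,\langle\Psi|(\hat{A}\hat{H} - \hat{H}\hat{A})\Psi\rangle = \frac{1}{i\hbar}\,\langle\Psi|\,(\hat{A}\hat{H} - \hat{H}\hat{A})\,|\Psi\rangle.$$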


The combination of operators $\hat{A}\hat{H} - \hat{H}\hat{A}$ is the commutator of $\hat{A}$ and $\hat{H}$. In
general, we shall use the widely-adopted shorthand notation

$$[\hat{A},\hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A}$$

for the commutator of any $\hat{A}$ and $\hat{B}$. Using this notation, we conclude that

$$\frac{d\langle A\rangle}{dt} = \frac{1}{i\hbar}\,\bigl\langle[\hat{A},\hat{H}]\bigr\rangle, \tag{1.46}$$

where the expectation values on both sides are calculated using the wave function
$\Psi$ that describes the state of the system at the time of interest.


We shall call Equation 1.46 the generalized Ehrenfest theorem. This important
result tells us that:


In any system, the rate of change of the expectation value of a quantity $A$ is
determined by the expectation value of the commutator of $\hat{A}$ with the
Hamiltonian operator $\hat{H}$ of the system.


The generalized Ehrenfest theorem applies to any observable $A$ whose operator
does not depend on time. Going back over the derivation, you can see we did not
even use the fact that $\hat{A}$ is Hermitian, although a crucial step relied on the
Hermitian character of $\hat{H}$.


Exercise 1.12 An important result follows from the generalized Ehrenfest
theorem in the special case where $\hat{A} = \hat{I}$, the identity operator. What is this result?
(The identity operator $\hat{I}$ leaves all functions unchanged: that is, $\hat{I}f(x) = f(x)$.)

Exercise 1.13 Show that $[\hat{A},\hat{B} + \hat{C}] = [\hat{A},\hat{B}]$ if $[\hat{A},\hat{C}] = 0$. (This result is
used in Equation 1.48.)
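The generalized Ehrenfest theorem (Equation 1.46) can be tested on a small matrix model. The sketch below assumes a two-state system with a made-up Hermitian Hamiltonian `H`, a made-up observable `A`, an initial state `psi0` and units with `hbar = 1`; it compares a numerical time derivative of the expectation value with the commutator expression.

```python
import numpy as np

hbar = 1.0                                               # units with hbar = 1 (an assumption)
H = np.array([[1.0, 0.5], [0.5, -1.0]], dtype=complex)   # hypothetical Hamiltonian
A = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)    # hypothetical observable
psi0 = np.array([1.0, 0.0], dtype=complex)               # hypothetical initial state

energies, vecs = np.linalg.eigh(H)

def evolve(t):
    """Solve Schrodinger's equation by expanding psi0 in energy eigenvectors."""
    coeffs = vecs.conj().T @ psi0
    return vecs @ (np.exp(-1j * energies * t / hbar) * coeffs)

def expectation(op, psi):
    """Return <psi|op|psi>, which is real for a Hermitian operator."""
    return np.vdot(psi, op @ psi).real

# Left-hand side of Equation 1.46: d<A>/dt by a central difference.
t, dt = 0.7, 1e-5
lhs = (expectation(A, evolve(t + dt)) - expectation(A, evolve(t - dt))) / (2 * dt)

# Right-hand side of Equation 1.46: <[A, H]> / (i hbar).
psi_t = evolve(t)
commutator = A @ H - H @ A
rhs = (np.vdot(psi_t, commutator @ psi_t) / (1j * hbar)).real

print(lhs, rhs)
```

The two sides agree to within the accuracy of the finite-difference derivative. Replacing `A` by `H` in the same code gives a vanishing rate of change, since `H` commutes with itself.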


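1.4.1 Ehrenfest’s equations

Ehrenfest’s first equation


Before looking at the full implications of the generalized Ehrenfest theorem, we
shall return to the unfinished business of justifying Equations 1.42 and 1.43.
First, for Equation 1.42, we put $\hat{A} = \hat{x}$ in Equation 1.46 to obtain

$$\frac{d\langle x\rangle}{dt} = \frac{1}{i\hbar}\,\bigl\langle[\hat{x},\hat{H}]\bigr\rangle. \tag{1.47}$$

So we need to evaluate the commutator of $\hat{x}$ with $\hat{H}$. We shall assume that the
Hamiltonian operator takes the usual form $\hat{H} = \hat{p}_x^2/2m + V(\hat{x})$. Because $\hat{x}$
commutes with $V(\hat{x})$, we have

$$[\hat{x},\hat{H}] = \left[\hat{x},\,\frac{\hat{p}_x^2}{2m} + V(\hat{x})\right] = \frac{1}{2m}\,[\hat{x},\hat{p}_x^2]. \tag{1.48}$$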

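The details of working out the remaining commutator are discussed in the
following worked example.


Worked Example 1.2

Essential skill: Evaluating a commutator

Simplify the commutator $[\hat{x},\hat{p}_x^2]$.

Solution


There are two different ways of answering this question.


General method: We can always write the commutator in terms of
explicit operators, let it act on an arbitrary function $f(x)$, and simplify the
result. In the present case, $\hat{x} = x$ and $\hat{p}_x^2 = -\hbar^2\,\partial^2/\partial x^2$, so we obtain

$$[\hat{x},\hat{p}_x^2]\,f(x) = -\hbar^2\left(x\,\frac{\partial^2 f}{\partial x^2} - \frac{\partial^2}{\partial x^2}\bigl(x f(x)\bigr)\right).$$


The derivative in the second term on the right-hand side is

$$\frac{\partial^2}{\partial x^2}\bigl(x f(x)\bigr) = \frac{\partial}{\partial x}\left(f(x) + x\,\frac{\partial f}{\partial x}\right) = 2\,\frac{\partial f}{\partial x} + x\,\frac{\partial^2 f}{\partial x^2}.$$

So

$$[\hat{x},\hat{p}_x^2]\,f(x) = 2\hbar^2\,\frac{\partial f}{\partial x}.$$


Since this equation is true for all $f(x)$, we can write it as a relationship
between operators:

$$[\hat{x},\hat{p}_x^2] = 2i\hbar\left(-i\hbar\,\frac{\partial}{\partial x}\right) = 2i\hbar\,\hat{p}_x. \tag{1.49}$$


Alternative method: An alternative method can be used for any
commutator involving powers of $\hat{x}$ and powers of $\hat{p}_x$. The idea is to write out
the commutator in full,

$$[\hat{x},\hat{p}_x^2] = \hat{x}\hat{p}_x\hat{p}_x - \hat{p}_x\hat{p}_x\hat{x},$$

and then use the known commutation relation $\hat{x}\hat{p}_x - \hat{p}_x\hat{x} = i\hbar$
(Equation 1.41) to achieve the same ordering in both terms. This gives

$$[\hat{x},\hat{p}_x^2] = (\hat{p}_x\hat{x} + i\hbar)\,\hat{p}_x - \hat{p}_x\,(\hat{x}\hat{p}_x - i\hbar) = 2i\hbar\,\hat{p}_x, \tag{1.50}$$

as before.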
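The result of this worked example can also be checked numerically, in the spirit of the general method. The sketch below assumes units with `hbar = 1`, a Gaussian test function and finite-difference derivatives; the helper names `d_dx` and `p_squared` are illustrative.

```python
import numpy as np

hbar = 1.0                                  # units with hbar = 1 (an assumption)
x = np.linspace(-8.0, 8.0, 3201)
f = np.exp(-(x + 0.5) ** 2)                 # hypothetical smooth test function

def d_dx(values):
    """Finite-difference approximation to d/dx on the grid x."""
    return np.gradient(values, x)

def p_squared(values):
    """Apply p_x^2 = -hbar^2 d^2/dx^2 by differentiating twice."""
    return -hbar**2 * d_dx(d_dx(values))

# Left-hand side: [x, p_x^2] acting on f.
lhs = x * p_squared(f) - p_squared(x * f)

# Right-hand side: 2 i hbar p_x acting on f, with p_x = -i hbar d/dx.
rhs = 2j * hbar * (-1j * hbar * d_dx(f))

max_error = np.max(np.abs(lhs - rhs))
print(max_error)
```

The two sides agree up to the discretization error of the finite differences, which supports Equation 1.49 for this particular test function.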


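Combining the result of this worked example with Equations 1.47 and 1.48, we
conclude that

$$\frac{d\langle x\rangle}{dt} = \frac{1}{i\hbar}\,\bigl\langle[\hat{x},\hat{H}]\bigr\rangle = \frac{1}{i\hbar}\,\frac{2i\hbar\,\langle p_x\rangle}{2m} = \frac{\langle p_x\rangle}{m},$$

which confirms Ehrenfest’s first equation (Equation 1.42).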


Ehrenfest’s second equation


Ehrenfest’s second equation can be derived in a similar way. Putting $\hat{A} = \hat{p}_x$ in
Equation 1.46 and noting that the momentum operator $\hat{p}_x$ commutes with the
kinetic energy operator $\hat{p}_x^2/2m$, we have

$$\frac{d\langle p_x\rangle}{dt} = \frac{1}{i\hbar}\,\bigl\langle[\hat{p}_x,\hat{H}]\bigr\rangle = \frac{1}{i\hbar}\,\bigl\langle[\hat{p}_x,V(\hat{x})]\bigr\rangle. \tag{1.51}$$


The following exercise asks you to fill in the remaining details.


Exercise 1.14 Show that

$$[\hat{p}_x,V(\hat{x})] = -i\hbar\,\frac{\partial V}{\partial x}, \tag{1.52}$$

and hence derive Ehrenfest’s second equation (Equation 1.43),

$$\frac{d\langle p_x\rangle}{dt} = -\left\langle\frac{\partial V}{\partial x}\right\rangle.$$


1.4.2 Conservation laws


The generalized Ehrenfest theorem links the rate of change of $\langle A\rangle$ to the
commutator of $\hat{A}$ with $\hat{H}$. The simplest possibility is for $\hat{A}$ to commute with $\hat{H}$, so
that the commutator $[\hat{A},\hat{H}]$ is equal to zero. In this case,

$$\frac{d\langle A\rangle}{dt} = 0,$$

and $\langle A\rangle$ remains constant in time, no matter what state the system is in. We shall
now consider some examples of this behaviour.


Conservation of energy 


The Hamiltonian operator obviously commutes with itself: $\hat{H}\hat{H} - \hat{H}\hat{H} = 0$. So we
have

$$\frac{d\langle H\rangle}{dt} = 0. \tag{1.53}$$

The observable corresponding to the Hamiltonian operator is the energy of the
system, so Equation 1.53 tells us that the expectation value of the energy remains
constant in time. This conclusion is based on the generalized Ehrenfest theorem,
which assumes that the operator under discussion ($\hat{H}$ in this case) does not depend
on time. This is a reasonable assumption, provided that the system is isolated, so
that it is not subject to time-dependent influences. We are therefore led to the
following quantum-mechanical version of the conservation of energy:


The expectation value of the energy of any isolated system remains constant 
in time. 


Exercise 1.15 Does the uncertainty in energy depend on time in an isolated
system? Hint: Does $\hat{H}^2$ commute with $\hat{H}$?


Conservation of momentum


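We now consider an isolated system of two interacting particles. In classical
mechanics, we would expect the total momentum of such a system to remain
fixed. Let us see what happens in quantum mechanics.


As usual, we simplify the analysis by restricting to one dimension (the
$x$-direction). We assume that the Hamiltonian operator of the isolated two-particle
system takes the form

$$\hat{H} = \frac{\hat{p}_1^2}{2m_1} + \frac{\hat{p}_2^2}{2m_2} + V(x_1 - x_2), \tag{1.54}$$

where the subscripts 1 and 2 label the two particles. For clarity, we have dropped
the subscript $x$ from the momentum operators, but it is understood that these
operators refer to momenta in the $x$-direction. Thus,

$$\hat{p}_1 = -i\hbar\,\frac{\partial}{\partial x_1} \quad\text{and}\quad \hat{p}_2 = -i\hbar\,\frac{\partial}{\partial x_2}.$$

The total momentum of the two-particle system is represented by the operator
$\hat{p}_1 + \hat{p}_2$, and the rate of change of the expectation value of the total momentum is

$$\frac{d}{dt}\langle p_1 + p_2\rangle = \frac{1}{i\hbar}\,\bigl\langle[\hat{p}_1 + \hat{p}_2,\hat{H}]\bigr\rangle = \frac{1}{i\hbar}\,\bigl\langle[\hat{p}_1,\hat{H}] + [\hat{p}_2,\hat{H}]\bigr\rangle.$$

The momentum operators $\hat{p}_1$ and $\hat{p}_2$ commute with both of the kinetic energy
operators $\hat{p}_1^2/2m_1$ and $\hat{p}_2^2/2m_2$ (because the ordering of different partial
differentiations does not matter). So we have

$$\frac{d}{dt}\langle p_1 + p_2\rangle = \frac{1}{i\hbar}\,\bigl\langle[\hat{p}_1,\hat{V}] + [\hat{p}_2,\hat{V}]\bigr\rangle,$$

and Equation 1.52 gives

$$\frac{d}{dt}\langle p_1 + p_2\rangle = -\left\langle\frac{\partial V}{\partial x_1} + \frac{\partial V}{\partial x_2}\right\rangle.$$


The potential energy function can be regarded as a function of the single variable
$x = x_1 - x_2$, so

$$\frac{\partial V}{\partial x_1} = \frac{dV}{dx}\,\frac{\partial x}{\partial x_1} = \frac{dV}{dx}\,\frac{\partial}{\partial x_1}(x_1 - x_2) = \frac{dV}{dx},$$

$$\frac{\partial V}{\partial x_2} = \frac{dV}{dx}\,\frac{\partial x}{\partial x_2} = \frac{dV}{dx}\,\frac{\partial}{\partial x_2}(x_1 - x_2) = -\frac{dV}{dx}.$$


Hence we conclude that

$$\frac{d}{dt}\langle p_1 + p_2\rangle = 0. \tag{1.55}$$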


This is the quantum-mechanical version of the law of conservation of momentum.
The conclusion relies heavily on our assumption that the potential energy function
takes the form $V(x_1 - x_2)$. This assumption makes good sense for an isolated
system because it means that the potential energy of the system depends only
on the relative positions of its particles, not on their positions with respect to
anything else.


Figure 1.8 Emmy Noether (1882-1935) proved theorems that established a link
between symmetries and conservation laws.


Figure 1.9 Murray Gell-Mann (1929-) proposed the existence of quarks on the
basis of symmetry arguments. Gell-Mann won the 1969 Nobel prize for physics.


Conservation laws and symmetry


The above examples give a glimpse into a powerful way of thinking about
conservation laws. We know that the expectation value of any observable $A$ is
conserved if the operator $\hat{A}$ commutes with $\hat{H}$ for the system under discussion.
When thinking about conservation laws, our attention therefore turns to the form
of the Hamiltonian operator.


To derive the quantum-mechanical version of energy conservation, we assumed
that the Hamiltonian of an isolated system is independent of time. To derive
the quantum-mechanical version of momentum conservation, we assumed
that the Hamiltonian of an isolated two-particle system depends only on the
relative coordinates of the particles; this implies that it does not depend on the
centre-of-mass coordinate, which tells us where the system is in space.


These assumptions can be expressed in terms of symmetry. We say that there is a 
symmetry if a given action does not change things. The fact that the Hamiltonian 
operator is independent of time means that it is symmetric under translations in 
time. The fact that it is independent of the centre-of-mass coordinate means that it 
is symmetric under translations in space. In general, wherever there is a symmetry 
in physics, there is a corresponding conservation law. In the next chapter you will 
see that the lack of a special direction in space leads to the law of conservation of 
angular momentum. Even the conservation of charge is related to a symmetry 
(known as gauge invariance). 


The link between symmetries and conservation laws pervades both classical and 
quantum physics. In classical physics, this link was explored extensively by 
Emmy Noether in 1918 (Figure 1.8). Noether was one of the first women to make 
an indelible mark on physics. This is not surprising, given the prejudices of
her day; Noether faced considerable opposition, and had to overcome rules
preventing women from enrolling on courses or giving lectures. Later generations 
of physicists exploited Noether’s ideas in the context of quantum physics, and 
especially particle physics. For example, Murray Gell-Mann (Figure 1.9) used 
symmetry arguments to predict the existence and mass of a new particle, and to 
inspire the idea that protons and neutrons contain quarks. 


1.5 The generalized uncertainty principle 


1.5.1 A more general uncertainty principle 


In Book 1 you met the Heisenberg uncertainty principle, which tells us that the
product of the uncertainties in $x$ and $p_x$ in any state must be at least as large
as $\hbar/2$:


$$\Delta x\,\Delta p_x \ge \frac{\hbar}{2}. \tag{1.56}$$


This principle denies us the possibility of knowing both the position and the
momentum of a particle precisely at the same time. It forces us to abandon the idea that particles move along
definite trajectories, and so finally demolishes old models of atoms in which 
electrons orbit the nucleus like planets going around the Sun. 




Heisenberg proposed his uncertainty principle in 1927, but gave no rigorous 
proof. Over the next year or so, other physicists filled in the gaps in Heisenberg’s 
reasoning. Then, in 1929, Howard Robertson realized that the Heisenberg 
uncertainty principle is a special case of a more general inequality that applies to 
all observables. This generalized uncertainty principle states that 


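$$\Delta A\,\Delta B \ge \tfrac{1}{2}\left|\langle[\hat{A},\hat{B}]\rangle\right|, \tag{1.57}$$


where $\Delta A$ and $\Delta B$ are the uncertainties of any observables $A$ and $B$ in a given
state, and the right-hand side involves the expectation value of the commutator of
$\hat{A}$ and $\hat{B}$ in the same state.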


It is easy to see that the generalized uncertainty principle reduces to the
Heisenberg uncertainty principle when $\hat{A} = \hat{x}$ and $\hat{B} = \hat{p}_x$, for we then have


$$[\hat{x},\hat{p}_x] = \mathrm{i}\hbar, \tag{Eqn 1.41}$$
which, when used in Equation 1.57, leads back to Equation 1.56.
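
Written out, the substitution takes only a line. The sketch below assumes a normalized state $|\Psi\rangle$, so that the expectation value of a constant is the constant itself:

```latex
\Delta x\,\Delta p_x
  \;\ge\; \tfrac{1}{2}\left|\langle[\hat{x},\hat{p}_x]\rangle\right|
  = \tfrac{1}{2}\left|\langle\Psi|\,\mathrm{i}\hbar\,|\Psi\rangle\right|
  = \tfrac{1}{2}\left|\mathrm{i}\hbar\right|\,\langle\Psi|\Psi\rangle
  = \frac{\hbar}{2}.
```

The right-hand side does not depend on the state here because the commutator of $\hat{x}$ and $\hat{p}_x$ is a constant. For a general pair of observables the lower bound varies from state to state.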
Exercise 1.16 What restriction does the generalized uncertainty principle
place on the ‘mixed’ uncertainty product $\Delta x\,\Delta p_y$?


Exercise 1.17 Combine the generalized uncertainty principle with the
generalized Ehrenfest theorem to show that the rate of change of the expectation
value of any observable $A$ must obey the inequality


$$\left|\frac{\mathrm{d}\langle A\rangle}{\mathrm{d}t}\right| \le \frac{2}{\hbar}\,\Delta A\,\Delta E, \tag{1.58}$$
where $E$ is the energy of the system.


In a stationary state, the energy has a definite value, so $\Delta E = 0$. Equation 1.58
then shows that $|\mathrm{d}\langle A\rangle/\mathrm{d}t| = 0$. So the expectation value of any observable
remains constant in a stationary state. This is a result you met in Book 1, and is a
good reason to call these states stationary.


However, the static character of stationary states should not be confused with the 
conservation laws we described earlier. Conservation laws apply to observables 
whose operators commute with the Hamiltonian operator. If this occurs for an 
observable A in a given system, the expectation value of A will remain constant in 
all states of the system, whether they are stationary or not. 
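
The generalized uncertainty principle (Equation 1.57) can be tested numerically in a small vector space. The Python sketch below is only an illustration under stated assumptions: the two $2\times 2$ Hermitian matrices `A` and `B` and the state `psi` are chosen for this example and do not come from the text, and the helper functions are hypothetical.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

def expectation(X, psi):
    """<psi|X|psi> for a normalized two-component state psi."""
    X_psi = [sum(X[i][j] * psi[j] for j in range(2)) for i in range(2)]
    return sum(psi[i].conjugate() * X_psi[i] for i in range(2))

def uncertainty(X, psi):
    """Delta X = sqrt(<X^2> - <X>^2); both expectation values are real for Hermitian X."""
    mean = expectation(X, psi).real
    mean_sq = expectation(matmul(X, X), psi).real
    return math.sqrt(max(mean_sq - mean**2, 0.0))

# Two illustrative Hermitian matrices (assumed for this sketch) that do not commute.
A = [[0, 1], [1, 0]]
B = [[0, -1j], [1j, 0]]

# An arbitrary normalized state: |0.6|^2 + |0.8|^2 = 1.
psi = [0.6, 0.8 * cmath.exp(1j * math.pi / 4)]

lhs = uncertainty(A, psi) * uncertainty(B, psi)
rhs = 0.5 * abs(expectation(commutator(A, B), psi))
print(f"Delta A * Delta B = {lhs:.4f},  (1/2)|<[A,B]>| = {rhs:.4f}")
```

For this state the product of uncertainties exceeds the lower bound. Changing `psi` changes both sides, but the left-hand side should never fall below the right-hand side.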


1.5.2 Proving the generalized uncertainty principle 


Finally, we prove the generalized uncertainty principle. Please note that this 
proof will not be assessed or examined. However, you are advised to follow 
it through. You will see that the uncertainty principle is not an independent 
assumption, but follows directly from very basic principles of quantum 
mechanics; this is a major success for the methods introduced in this chapter. 
The proof will also give you useful practice at manipulating Hermitian 
operators. 
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The Cauchy—Schwarz inequality 
was proved in Exercise 1.7. 
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Our starting point for proving Equation 1.57 is the Cauchy—Schwarz inequality 
$$\langle a|a\rangle\,\langle b|b\rangle \ge |\langle a|b\rangle|^2. \tag{Eqn 1.19}$$


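For our purposes, it is helpful to express this in a slightly different form. We note
that for any complex number $z = \mathrm{Re}(z) + \mathrm{i}\,\mathrm{Im}(z)$,


$$|z|^2 \ge |\mathrm{Im}(z)|^2 = \left|\frac{z - z^*}{2\mathrm{i}}\right|^2 = \tfrac{1}{4}\,|z - z^*|^2.$$


Now, we can set $z = \langle a|b\rangle$ and $z^* = \langle a|b\rangle^* = \langle b|a\rangle$ in this inequality to obtain
$$|\langle a|b\rangle|^2 \ge \tfrac{1}{4}\,|\langle a|b\rangle - \langle b|a\rangle|^2.$$
Combining this with the Cauchy–Schwarz inequality, we obtain
$$\langle a|a\rangle\,\langle b|b\rangle \ge \tfrac{1}{4}\,|\langle a|b\rangle - \langle b|a\rangle|^2. \tag{1.59}$$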


This inequality is valid for any vectors $|a\rangle$ and $|b\rangle$. We have not yet made any
connection with the uncertainty principle, but we are about to do so.
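
Because Equation 1.59 holds for any pair of vectors, it can be spot-checked on randomly chosen ones. The Python sketch below assumes vectors in a four-dimensional complex space (the dimension, the random seed and the function names are arbitrary choices made for this illustration) and records the smallest gap between the two sides.

```python
import random

def inner(a, b):
    """Inner product <a|b> of two complex vectors, conjugating the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def random_vector(n, rng):
    """Vector with n complex components drawn uniformly from the unit square."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so that the run is repeatable
worst_margin = float("inf")
for _ in range(1000):
    a = random_vector(4, rng)
    b = random_vector(4, rng)
    lhs = inner(a, a).real * inner(b, b).real
    rhs = 0.25 * abs(inner(a, b) - inner(b, a)) ** 2
    worst_margin = min(worst_margin, lhs - rhs)

print("smallest value of lhs - rhs found:", worst_margin)
```

A numerical search of this kind cannot prove the inequality, but a single negative margin would be enough to disprove it.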


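We consider a system in a state described by the vector $|\Psi\rangle$, and two observables,
$A$ and $B$, that can be measured in this system. The observables are represented by
linear Hermitian operators $\hat{A}$ and $\hat{B}$, and we can introduce the vectors


$$|a\rangle = \hat{A}\,|\Psi\rangle = |\hat{A}\Psi\rangle,$$
$$|b\rangle = \hat{B}\,|\Psi\rangle = |\hat{B}\Psi\rangle.$$
Inserting these vectors into Equation 1.59, we obtain


$$\langle\hat{A}\Psi|\hat{A}\Psi\rangle\,\langle\hat{B}\Psi|\hat{B}\Psi\rangle \ge \tfrac{1}{4}\,\big|\langle\hat{A}\Psi|\hat{B}\Psi\rangle - \langle\hat{B}\Psi|\hat{A}\Psi\rangle\big|^2. \tag{1.60}$$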


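Now, the crucial point is that the operators $\hat{A}$ and $\hat{B}$ are Hermitian, and this allows
us to move them from the left-hand slot of a Dirac bracket to the right-hand slot.
Doing this throughout Equation 1.60 gives


$$\langle\Psi|\hat{A}\hat{A}\Psi\rangle\,\langle\Psi|\hat{B}\hat{B}\Psi\rangle \ge \tfrac{1}{4}\,\big|\langle\Psi|\hat{A}\hat{B}\Psi\rangle - \langle\Psi|\hat{B}\hat{A}\Psi\rangle\big|^2,$$
which can be written as
$$\langle\Psi|\hat{A}^2|\Psi\rangle\,\langle\Psi|\hat{B}^2|\Psi\rangle \ge \tfrac{1}{4}\,\big|\langle\Psi|(\hat{A}\hat{B} - \hat{B}\hat{A})|\Psi\rangle\big|^2.$$


The quantities $\langle\Psi|\cdots|\Psi\rangle$ appearing in this inequality are all expectation values in
the state $\Psi$, so we have


$$\langle A^2\rangle\,\langle B^2\rangle \ge \tfrac{1}{4}\,\big|\langle[\hat{A},\hat{B}]\rangle\big|^2. \tag{1.61}$$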
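This is very close to the generalized uncertainty principle. You may recall from
Book 1 that the squares of the uncertainties of $A$ and $B$ are given by
$$(\Delta A)^2 = \langle A^2\rangle - \langle A\rangle^2 = \big\langle(A - \langle A\rangle)^2\big\rangle, \tag{1.62}$$
$$(\Delta B)^2 = \langle B^2\rangle - \langle B\rangle^2 = \big\langle(B - \langle B\rangle)^2\big\rangle. \tag{1.63}$$
So, in the special case where $\langle A\rangle = \langle B\rangle = 0$, Equation 1.61 becomes


$$(\Delta A)^2\,(\Delta B)^2 \ge \tfrac{1}{4}\,\big|\langle[\hat{A},\hat{B}]\rangle\big|^2,$$


and the generalized uncertainty principle follows on taking the square root of both
sides.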


The only remaining step is to show that the same conclusion applies when $\langle A\rangle$
and $\langle B\rangle$ are non-zero. The key point is that Equation 1.61 is valid for any pair of
Hermitian operators. We can therefore replace $\hat{A}$ and $\hat{B}$ by other Hermitian
operators, chosen to make the left-hand side as small as possible.


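Now, it is easy to see that $\hat{A} - \langle A\rangle$ is a Hermitian operator, since it is the
difference of two Hermitian operators ($\hat{A}$ and the operator telling us to multiply
by the real number $\langle A\rangle$). For similar reasons, $\hat{B} - \langle B\rangle$ is Hermitian. We can
therefore obtain a valid inequality by making the replacements


$$\hat{A} \Rightarrow \hat{A} - \langle A\rangle \quad\text{and}\quad \hat{B} \Rightarrow \hat{B} - \langle B\rangle$$
consistently throughout Equation 1.61. This gives
$$\big\langle(A - \langle A\rangle)^2\big\rangle\,\big\langle(B - \langle B\rangle)^2\big\rangle \ge \tfrac{1}{4}\,\big|\big\langle[\hat{A} - \langle A\rangle,\ \hat{B} - \langle B\rangle]\big\rangle\big|^2.$$


Taking the square root of both sides and using the definition of uncertainty
(Equations 1.62 and 1.63), we obtain


$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,\big|\big\langle[\hat{A} - \langle A\rangle,\ \hat{B} - \langle B\rangle]\big\rangle\big|. \tag{1.64}$$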


The final step is to simplify the commutator on the right-hand side. You can do 
this in the following exercise. 


Exercise 1.18 Given two linear operators $\hat{A}$ and $\hat{B}$, show that
$$[\hat{A} - \langle A\rangle,\ \hat{B} - \langle B\rangle] = [\hat{A},\hat{B}],$$


and hence complete the proof of the generalized uncertainty principle.


Summary of Chapter 1


Section 1.1 The state of a quantum system can be represented by a ket vector in 
an abstract vector space called function space. For wave mechanics in one 
dimension, the inner product is given by 


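$$\langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)\,g(x)\,\mathrm{d}x.$$
This inner product is a complex number with the properties


$$\langle f|g\rangle = \langle g|f\rangle^*, \quad \langle f|cg\rangle = c\,\langle f|g\rangle \quad\text{and}\quad \langle cf|g\rangle = c^*\langle f|g\rangle,$$


and it obeys the inequalities
$$\langle f|f\rangle \ge 0 \quad\text{and}\quad \langle f|f\rangle\,\langle g|g\rangle \ge |\langle f|g\rangle|^2.$$


Section 1.2 The Dirac bracket $\langle f|g\rangle$ can be regarded as a joining together of a
bra vector $\langle f|$ and a ket vector $|g\rangle$. It is important to remember that the ket vector
$|g\rangle = \sum_i c_i\,|g_i\rangle$ corresponds to the bra vector $\langle g| = \sum_i c_i^*\,\langle g_i|$, and vice versa.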
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Section 1.3 Observable quantities are represented by linear Hermitian operators. 
By definition, an operator $\hat{A}$ is Hermitian if


$$\langle\hat{A}f|g\rangle = \langle f|\hat{A}g\rangle$$


for all normalizable functions f and g. Hermitian operators have real expectation 
values and real eigenvalues. If two eigenfunctions (or eigenvectors) of a 
Hermitian operator correspond to different eigenvalues, they are orthogonal. 


If $\hat{A}$ and $\hat{B}$ are Hermitian, the product $\hat{A}\hat{B}$ is Hermitian if and only if $\hat{A}$ commutes
with $\hat{B}$. Any power of $\hat{A}$ is Hermitian, $\hat{A}\hat{B} + \hat{B}\hat{A}$ is Hermitian, and any linear
combination $\alpha\hat{A} + \beta\hat{B}$ is Hermitian provided that the constants $\alpha$ and $\beta$ are real.


Section 1.4 The generalized Ehrenfest theorem states that the rate of change of
the expectation value of an observable is


$$\frac{\mathrm{d}\langle A\rangle}{\mathrm{d}t} = \frac{1}{\mathrm{i}\hbar}\,\big\langle[\hat{A},\hat{H}]\big\rangle,$$


where $\hat{H}$ is the Hamiltonian operator of the system, and the expectation values on
both sides of the equation are calculated for the same state. If $\hat{A}$ commutes
with $\hat{H}$, the expectation value of $A$ remains constant in time, no matter what state
the system is in. This leads to the quantum-mechanical versions of the laws of
conservation of energy and momentum. Such conservation laws can be related to
symmetries of the system.


Section 1.5 The generalized uncertainty principle states that
$$\Delta A\,\Delta B \ge \tfrac{1}{2}\,\big|\langle[\hat{A},\hat{B}]\rangle\big|,$$


where $\Delta A$ and $\Delta B$ are the uncertainties of any two observables $A$ and $B$ in
a given state, and the right-hand side involves the expectation value of the
commutator of $\hat{A}$ and $\hat{B}$ in the given state.


Achievements from Chapter 1


After studying this chapter, you should be able to: 
1.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


1.2 Explain why it is appropriate to represent a quantum state by a vector in a 
vector space. 


1.3 Use Dirac brackets and bra and ket vectors in simple calculations. 

1.4 State the properties of Hermitian operators and use them in calculations. 
1.5 Evaluate the commutator of a given pair of operators. 

1.6 State and apply the generalized Ehrenfest theorem. 

1.7 Discuss the relationship between symmetries and conservation laws. 


1.8 State and apply the generalized uncertainty principle. 


Chapter 2 Introduction to angular momentum 


Introduction 


This chapter is an introduction to angular momentum in quantum mechanics. 
You may know something about this concept from your studies of classical 
mechanics. A rotating wheel, a spinning ball and an orbiting planet all have 
angular momentum. In many circumstances, the angular momentum of a system 
is conserved, remaining constant in time. For example, a planet in orbit around 
the Sun has a constant angular momentum; this explains why the planet has a 
planar orbit, and why it sweeps out equal areas in equal times (one of Kepler’s 
laws of planetary motion). 


Angular momentum also plays a vital role in microscopic systems such as atoms 
and molecules. Some states of a hydrogen atom or a hydrogen chloride molecule 
have angular momentum, and so do protons and electrons. Of course, we cannot 
observe the motion of an electron by viewing it through a microscope, so you 
might wonder how we can know anything at all about its angular momentum. 
Fortunately, there is a close link between the angular momentum of a particle 
and its magnetic dipole moment (the quantity that determines how the particle 
interacts with a magnetic field). We can therefore learn a lot about angular 
momentum by observing how particles respond to applied magnetic fields. This 
formed the basis of an experiment carried out by Stern and Gerlach in 1922, 
which led to the conclusion that the angular momentum of an atom is quantized. 


A dozen years before quantum mechanics was established, Bohr proposed a 
semi-quantum model of a hydrogen atom in which he treated the orbiting electron 
rather like a planet in orbit around the Sun, except that he assumed that the 
angular momentum of the electron would be quantized. Bohr’s model was not 
satisfactory, but he was right about the quantization of angular momentum. The 
challenge for quantum mechanics is to explain this fact. Here, you will see how 
this is done using linear operators and eigenvalue equations. This is the first step 
towards developing the quantum-mechanical theory of angular momentum. 


The chapter is organized as follows. Section 2.1 reviews the classical physics of 
angular momentum, both for moving particles and for rotating rigid bodies. 
Section 2.2 uses classical notions to establish a link between angular momentum 
and the magnetic dipole moment. This leads to a description of an experiment 
that provided convincing evidence for the quantization of angular momentum. 
Section 2.3 uses quantum-mechanical principles to show that the Cartesian 
components of angular momentum are quantized in units of $\hbar$. This section also
describes how the magnitude of the angular momentum is quantized, and uses 
this to interpret the spectra of rotating molecules. Section 2.4 presents the 
quantum-mechanical version of the law of conservation of angular momentum 
and relates it to the generalized Ehrenfest theorem. Section 2.5 then shows that 
different components of angular momentum obey an uncertainty relation. In 
general, this makes it impossible to find states in which two different components 
of angular momentum both have definite values. However, in situations where 
angular momentum is conserved, it is possible to find states where the energy, 
one component of the angular momentum and the magnitude of the angular 
momentum all have definite values. This has important consequences for the way
quantum states are labelled in atoms.


Figure 2.1 A particle P is in
motion about a fixed origin O;
its displacement from O is $\mathbf{r}$, and
its momentum is $\mathbf{p} = m\,\mathrm{d}\mathbf{r}/\mathrm{d}t$.


Vector products and the
right-hand rule are discussed
in Section 8.1.4 of the
Mathematical toolkit.


The present chapter is only an introduction to angular momentum in quantum 
mechanics. It deals with orbital angular momentum — angular momentum that 
is associated with moving particles or with rotating bodies. This is the type of 
angular momentum we are familiar with in classical physics. However, at the end 
of the chapter we shall point the way to another type of angular momentum, called 
spin angular momentum, or spin for short. This type of angular momentum is an 
intrinsic property of certain particles irrespective of any motion they may have. 
For example, an electron would have spin angular momentum, even if it were 
stationary. Spin is a purely quantum-mechanical concept which, in spite of its 
name, cannot be visualized in terms of particles spinning on their axes. The 
chapter that follows this one is all about spin. Indeed, one of the reasons for 
discussing orbital angular momentum now is that it will help us understand
spin, which is a crucial ingredient for later chapters in the book, dealing with
entanglement and the interpretation of quantum mechanics. Later, in Book 3, there 
will be yet another chapter devoted to angular momentum, in which we discuss 
aspects that are especially relevant for the description of atoms and molecules. 


2.1 Review of classical angular momentum 


2.1.1 The angular momentum of moving particles 


Figure 2.1 shows a particle P, of mass $m$, moving relative to a fixed origin, O. At
a given time, the particle’s displacement from O is $\mathbf{r}$, and its momentum is
$\mathbf{p} = m\,\mathrm{d}\mathbf{r}/\mathrm{d}t$. In general, both these quantities depend on time, although we shall
not indicate this in our notation. In classical physics, at any given instant, the
orbital angular momentum of the particle about O is defined to be


$$\mathbf{L} = \mathbf{r}\times\mathbf{p}. \tag{2.1}$$


For brevity, we shall often refer to this as the angular momentum of the particle. 


The appearance of a vector product in Equation 2.1 is significant. It means that 
the magnitude of the angular momentum is given by $L = rp\sin\theta$, where $\theta$ is the
angle between the directions of the vectors $\mathbf{r}$ and $\mathbf{p}$ marked in Figure 2.1. It also
means that L is a vector perpendicular to both r and p. The precise direction of L 
is fixed by the right-hand rule; in the situation shown in Figure 2.1, this implies 
that the angular momentum is directed out of the page, towards you. 


Exercise 2.1 A particle of mass $m$ and constant speed $v$ performs uniform
circular motion of radius $r$ in a horizontal plane. Viewed from above, the motion
is clockwise. Describe the magnitude and direction of the angular momentum of
this particle about an origin at the centre of its circular path.


Many physical concepts (such as energy and momentum) are important, in 
part, because they are subject to conservation laws, and this is true of angular 
momentum. For example, in the situation shown in Figure 2.1, we can consider 
what happens when the force acting on the particle is a central force — which 
implies that it always acts along the line joining the particle to the fixed point O. 




This would be true for a planet experiencing the gravitational tug of a star, or an 
electron experiencing the electrostatic attraction of a proton. In general, the rate of 
change of the angular momentum about O is given by 

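$$\frac{\mathrm{d}\mathbf{L}}{\mathrm{d}t} = \frac{\mathrm{d}}{\mathrm{d}t}(\mathbf{r}\times\mathbf{p}) = \dot{\mathbf{r}}\times\mathbf{p} + \mathbf{r}\times\dot{\mathbf{p}}. \tag{2.2}$$
However, $\mathbf{p}$ is just $m\dot{\mathbf{r}}$, so the first term is the vector product of two parallel
vectors, and must be zero. Using Newton’s second law in the form $\mathbf{F} = \dot{\mathbf{p}}$, we can
write the second term as $\mathbf{r}\times\mathbf{F}$. Provided that the force is central, this is also a
vector product of two parallel (or antiparallel) vectors, and so is equal to zero. We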
conclude that the angular momentum of a particle subject to a central force 
remains constant: angular momentum is conserved. 
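
The conservation of angular momentum under a central force can be seen in a simple numerical orbit. The Python sketch below is an illustration under stated assumptions: it uses an attractive inverse-square force with arbitrary units, a velocity Verlet integrator, and starting values chosen to give an elliptical orbit in the $xy$-plane. None of these choices comes from the text, and the function names are hypothetical.

```python
def acceleration(x, y, k=1.0):
    """Acceleration due to an attractive inverse-square central force (arbitrary units)."""
    r_cubed = (x * x + y * y) ** 1.5
    return -k * x / r_cubed, -k * y / r_cubed

def angular_momentum_z(m, x, y, vx, vy):
    """z-component of L = r x p for motion in the xy-plane."""
    return m * (x * vy - y * vx)

m = 1.0
x, y = 1.0, 0.0
vx, vy = 0.0, 0.8      # slower than the circular-orbit speed, so the orbit is an ellipse
dt = 1e-3

L_start = angular_momentum_z(m, x, y, vx, vy)
ax, ay = acceleration(x, y)
for _ in range(20000):
    # Velocity Verlet step: half kick, drift, half kick.
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
    x += dt * vx
    y += dt * vy
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax
    vy += 0.5 * dt * ay
L_end = angular_momentum_z(m, x, y, vx, vy)

print(L_start, L_end)
```

Each kick changes the velocity along the direction of $\mathbf{r}$ and each drift changes the position along the direction of $\mathbf{v}$, so neither alters $\mathbf{r}\times\mathbf{p}$. This mirrors the two vanishing terms in Equation 2.2, and the computed angular momentum stays constant to within rounding error even though the distance and speed vary around the orbit.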


The conservation of angular momentum has many applications in astronomy. For 
example, the gravitational force of the Sun on a planet is central, so the angular 
momentum of the planet is conserved. It can be shown that this implies that the 
radius vector joining the Sun to the planet sweeps out equal areas in equal times, a 
fact known as Kepler’s second law (Figure 2.2). 


We use dot notation for
differentiation with respect to
time: $\dot{\mathbf{r}} = \mathrm{d}\mathbf{r}/\mathrm{d}t$.


Figure 2.2 Kepler’s second law. The white areas, corresponding to equal
intervals of time, are equal.


Figure 2.3 If the plane of the Earth’s orbit did not include the Sun, the
angular momentum $\mathbf{L}$ of the Earth would point in different directions at
different times, e.g. at two points $\mathrm{E}_1$ and $\mathrm{E}_2$.


● Use the conservation of angular momentum to show that the Sun must lie in
the orbital plane of a planet.


○ If the Sun were not in the orbital plane of the planet, the radius vector $\mathbf{r}$ from
the Sun to the planet would lie on something like a cone; the angular 
momentum L is always normal to r, and would therefore vary in direction as 
the planet travelled around its orbit (Figure 2.3). This is impossible because 
the angular momentum of a planet is conserved, so the vector L must be 
constant in both magnitude and direction. 


We often need the Cartesian components of the angular momentum vector. An 
easy-to-remember way of writing these down is to express the vector product in 
terms of a determinant, as follows: 


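$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ x & y & z \\ p_x & p_y & p_z \end{vmatrix}, \tag{2.3}$$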


Section 8.3.5 of the 
Mathematical toolkit gives a 
review of determinants. 
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This procedure of cycling the 
subscripts is sometimes referred 
to as a cyclic permutation. 


Figure 2.4 A wheel rotating 
about a fixed axle can be 
considered to be made up of 
many small elements each in 
orbit about the axle. 
where $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$ are Cartesian unit vectors. Expanding this determinant, we
see that


$$\mathbf{L} = L_x\mathbf{e}_x + L_y\mathbf{e}_y + L_z\mathbf{e}_z, \tag{2.4}$$
where

$$L_x = y\,p_z - z\,p_y, \tag{2.5}$$

$$L_y = z\,p_x - x\,p_z, \tag{2.6}$$

$$L_z = x\,p_y - y\,p_x. \tag{2.7}$$


All of the expressions for $L_x$, etc. can be obtained from any other one by cycling
the indices wherever they appear: $x \Rightarrow y \Rightarrow z \Rightarrow x$. In this way, for example,
we get $L_x$ by replacing $z$ by $x$, $x$ by $y$ and $y$ by $z$ in the expression for $L_z$.
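
The component formulas translate directly into code. The following Python sketch writes out Equations 2.5 to 2.7 and checks two properties mentioned in the text: that $\mathbf{L}$ is perpendicular to both $\mathbf{r}$ and $\mathbf{p}$, and that cycling the indices cycles the components. The numerical values of `r` and `p` are arbitrary choices made for this illustration.

```python
def angular_momentum(r, p):
    """Cartesian components of L = r x p, written out as in Equations 2.5 to 2.7."""
    x, y, z = r
    px, py, pz = p
    Lx = y * pz - z * py
    Ly = z * px - x * pz
    Lz = x * py - y * px
    return (Lx, Ly, Lz)

def dot(u, v):
    """Scalar product of two three-component vectors."""
    return sum(a * b for a, b in zip(u, v))

def cycle(v):
    """Cyclic permutation of the components: (x, y, z) -> (y, z, x)."""
    return (v[1], v[2], v[0])

r = (1.0, 2.0, 3.0)    # arbitrary displacement
p = (-2.0, 0.5, 4.0)   # arbitrary momentum

L = angular_momentum(r, p)
print("L =", L)
print("L.r =", dot(L, r), " L.p =", dot(L, p))
print("cycled inputs give:", angular_momentum(cycle(r), cycle(p)))
```

Both scalar products vanish, confirming that $\mathbf{L}$ is normal to the plane containing $\mathbf{r}$ and $\mathbf{p}$. Feeding in cyclically permuted vectors returns the components of $\mathbf{L}$ in the same cyclically permuted order.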


Exercise 2.2 Use Equation 2.3 to confirm that $L_y$ is given by
Equation 2.6.


2.1.2 Angular momentum of rotating rigid bodies 


So far we have discussed the angular momentum of particles, but classical physics 
is also concerned with the angular momentum of extended bodies, such as wheels 
and boomerangs, and quantum physics deals with the angular momentum of 
rotating molecules or atomic nuclei. We can restrict the discussion to the rotation 
of a rigid body about a fixed axis — the rotation of a wheel about a fixed axle, for 
example (Figure 2.4). 


Equation 2.1 applies to each particle in an extended body, and the total angular 
momentum of the body is found by adding together the angular momenta of its 
particles. This is a vector sum, but if the body rotates about a fixed axis, all 
contributions to the angular momentum are in the same direction, along the axis of 
rotation. For a rigid body, each particle has the same angular speed of rotation, $\omega$.
It can then be shown that the whole body has an angular momentum of magnitude


$$L = I\omega, \tag{2.8}$$


where the constant $I$ is the body’s moment of inertia about the given axis of
rotation, and $\omega$ is the angular speed of rotation. The moment of inertia of the body
is given by $I = \sum_i m_i d_i^2$, where $m_i$ and $d_i$ are the mass and distance from the axis
of rotation of particle $i$, and the sum is taken over all the particles in the body.


We can also add up the kinetic energies of all the particles in the body to obtain
the rotational kinetic energy, $E_{\text{rot}}$, of the body. This turns out to be


$$E_{\text{rot}} = \tfrac{1}{2}I\omega^2. \tag{2.9}$$
This expression is correct, but is not in a form suitable for quantizing. When we
come to quantize the rotational kinetic energy of a diatomic molecule, we will
find it more convenient to combine Equations 2.8 and 2.9 and write


$$E_{\text{rot}} = \frac{L^2}{2I}. \tag{2.10}$$


This is reminiscent of the fact that the kinetic energy of a free particle is best
expressed as $p^2/2m$ in quantum mechanics, rather than as $\tfrac{1}{2}mv^2$.
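
The step from Equations 2.8 and 2.9 to Equation 2.10 is a one-line elimination of the angular speed, and the free-particle analogy follows the same pattern with $p = mv$:

```latex
\omega = \frac{L}{I}
\quad\Longrightarrow\quad
E_{\text{rot}} = \tfrac{1}{2} I \omega^2
             = \tfrac{1}{2} I \left(\frac{L}{I}\right)^{2}
             = \frac{L^2}{2I},
\qquad
v = \frac{p}{m}
\quad\Longrightarrow\quad
\tfrac{1}{2} m v^2 = \frac{p^2}{2m}.
```

In both cases the energy ends up expressed through a quantity (angular momentum or momentum) that is represented by an operator in quantum mechanics.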


2.2 The Stern—Gerlach experiment 




We cannot directly observe the rotations of atoms, so it is not clear how we can 
measure their angular momenta. However, it turns out that many atoms behave 
like tiny magnets. The magnetic properties of an atom can be characterized by a 
quantity called the magnetic dipole moment which turns out to be proportional to 
the angular momentum of the atom. Hence we can find out about the angular 
momentum of an atom by measuring its magnetic dipole moment. 


This is the principle behind a ground-breaking experiment carried out by Otto 
Stern and Walther Gerlach in 1922. Stern and Gerlach demonstrated that the 
magnetic dipole moments of atoms are quantized, which is tantamount to showing 
that their angular momentum is quantized. Before we describe this famous 
experiment, we shall introduce the concept of magnetic dipole moment, and show 
how it is related to angular momentum. 


2.2.1 Magnetic dipoles in magnetic fields 


Anyone who has undergone an MRI scan has benefited from the fact that some 
atomic nuclei behave like tiny magnets, called magnetic dipoles. Magnetism is a 
pervasive property in the microscopic world, but it is convenient to introduce the 
concepts we need in the familiar context of classical physics. A simple example 
of a magnetic dipole is provided by a small circular loop of wire carrying a 
steady electric current. Such a magnetic dipole has some magnetic properties:
for example, it produces a magnetic field and it responds to an externally-applied
magnetic field. Provided that the area of the loop is small, the magnetic properties
of the loop all depend on a single vector quantity, $\boldsymbol{\mu}$, called the magnetic dipole
moment of the loop. The magnitude of the magnetic dipole moment is
$$\mu = IA, \tag{2.11}$$
where $I$ is the current through the loop and $A$ is the area of the loop (see
Figure 2.5). The direction of the magnetic dipole moment is perpendicular to the
plane of the loop in the sense defined by the right-hand grip rule. This rule
involves curling the fingers of the right hand in the direction of current flow
around the loop; the extended right thumb then indicates the direction of the
magnetic dipole moment. It is convenient to introduce the oriented area $\mathbf{A}$,
which is a vector quantity of magnitude $A$, pointing in a direction perpendicular
to the area of the loop in the sense defined by the right-hand grip rule. The
magnetic dipole moment of the current loop can then be expressed as


$$\boldsymbol{\mu} = I\mathbf{A}. \tag{2.12}$$


Figure 2.5 A circular loop of wire of area $A$, carrying a current $I$, has a
magnetic dipole moment of magnitude $\mu = IA$ (or $NIA$ if the loop has $N$ turns).
The vector $\boldsymbol{\mu}$ is in the direction shown, in agreement with the right-hand grip rule.


Not all magnetic dipoles can be visualized as current loops; a compass needle is a 
good example to keep in mind. Like a compass needle, any magnetic dipole 
responds to an external magnetic field as follows: when placed in a magnetic 
field, a magnetic dipole will experience a torque whose direction is that which 
would align a stationary magnetic dipole with the magnetic field. This torque is 
given by the expression 


$$\boldsymbol{\Gamma} = \boldsymbol{\mu}\times\mathbf{B}, \tag{2.13}$$
which is zero when $\boldsymbol{\mu}$ and $\mathbf{B}$ are parallel.
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Alternatively, we can think in terms of energy. The magnetic dipole has a 
potential energy in the magnetic field given by 


$$E_{\text{mag}} = -\boldsymbol{\mu}\cdot\mathbf{B}. \tag{2.14}$$


This has its smallest value, $-\mu B$, when $\boldsymbol{\mu}$ is parallel to $\mathbf{B}$, so it is energetically
favourable for the magnetic dipole moment to align with the field.


A magnetic dipole in a uniform magnetic field, such as the Earth’s field in a local 
region, feels a torque tending to align it in the direction of the field, but it feels no 
force pulling it as a whole in any direction. However, the magnetic dipole does 
feel a force in a non-uniform magnetic field. The force experienced by a magnetic 
dipole in a non-uniform magnetic field is minus the gradient of the potential 
energy of the magnetic dipole. If the magnetic field points in the $z$-direction, the
$z$-component of the force is
$$F_z = -\frac{\partial E_{\text{mag}}}{\partial z} = \mu_z\,\frac{\partial B_z}{\partial z}. \tag{2.15}$$
In a uniform magnetic field, this force is equal to zero, though the torque on the
magnetic dipole need not be equal to zero, as we have seen. However, if the
magnetic field is non-uniform, the magnetic dipole will feel a net force pulling it
in a direction that depends on the relative orientation of $\boldsymbol{\mu}$ and $\mathbf{B}$. If the magnetic
dipole moment is roughly parallel to the magnetic field, it is drawn
towards regions of greater field strength; if the magnetic dipole moment is
roughly antiparallel to the magnetic field, it is drawn towards regions of
lesser field strength.


On a microscopic scale, many atoms and nuclei behave as magnetic dipoles. 
You will see that the force acting on an atom in a non-uniform magnetic field 
(Equation 2.15) is a crucial element in the Stern—Gerlach experiment discussed 
later in this section. 


The SI unit of magnetic dipole moment: Alternative SI units for magnetic
dipole moment can be seen by analyzing Equations 2.12 and 2.14. The first
gives units of amperes times square metres (A m$^2$), while the second gives
units of joules per tesla (J T$^{-1}$); these units are equivalent and can be used
interchangeably.


2.2.2 Magnetic dipole moments and angular momentum 


The link between magnetic dipole moments and angular momenta can be 
illustrated by a simple classical model. We consider a particle of charge q and 
mass m, moving at constant speed v around a circle of radius r. This particle has 
an angular momentum about the centre of the circle of magnitude $L = mvr$.
Because the particle is charged, it also carries a current around the circle, and so 
produces a magnetic dipole moment whose magnitude we shall now calculate. 


The current is equal to the charge per unit time that passes a fixed point. This is
equal to the charge $q$ of the particle divided by the time it takes to complete one
lap, $T = 2\pi r/v$. So the current has magnitude $|q|/T = |q|\,v/2\pi r$, and the
magnetic dipole moment has magnitude


$$\mu = \frac{|q|\,v}{2\pi r}\times\pi r^2 = \tfrac{1}{2}|q|\,vr.$$


A particle in circular motion has an angular momentum of magnitude $L = mvr$,
so we have


$$\mu = \frac{|q|}{2m}\,L.$$
For $q > 0$, the magnetic dipole moment vector points in the same direction as the
angular momentum vector. For $q < 0$, the magnetic dipole moment vector points
in the opposite direction to the angular momentum vector. We therefore have the
vector equation


$$\boldsymbol{\mu} = \frac{q}{2m}\,\mathbf{L}.$$
The important point here is that the magnetic dipole moment and angular 
momentum vectors are proportional to one another, and the proportionality 
constant depends only on q and m, intrinsic properties of the orbiting particle. 


In general,
$$\boldsymbol{\mu} = \gamma\mathbf{L}, \tag{2.16}$$


where the proportionality constant $\gamma$ is called the gyromagnetic ratio. This is
equal to $q/2m$ for the classical orbiting particle considered above. Other rotating
bodies may have different values of $\gamma$: for example, a rotating compact disc
with an extra electron on its rim has a very small ratio of magnetic moment to
angular momentum, so its gyromagnetic ratio is much smaller than the magnitude
of $q/2m$ for an electron.
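
The classical result $\gamma = q/2m$ can be confirmed by direct calculation. The Python sketch below follows the argument in the text (current, then magnetic moment, then angular momentum) for an electron on a circular orbit. The orbital speed and radius are illustrative values assumed for this example, and the function name is hypothetical. The physical constants are the standard SI values.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge in coulombs
E_MASS = 9.1093837015e-31     # electron mass in kilograms

def orbit_moments(q, m, v, r):
    """Return the magnitudes (mu, L) for a charge q of mass m on a circular orbit."""
    period = 2 * math.pi * r / v          # time for one lap
    current = abs(q) / period             # charge passing a fixed point per unit time
    mu = current * math.pi * r**2         # Equation 2.11: mu = I A
    L = m * v * r                         # angular momentum about the centre
    return mu, L

# Illustrative orbit (values assumed for this sketch, roughly atomic in scale).
mu, L = orbit_moments(-E_CHARGE, E_MASS, v=2.0e6, r=5.0e-11)
ratio = mu / L
expected = E_CHARGE / (2 * E_MASS)

print(f"mu = {mu:.3e} J/T, L = {L:.3e} J s")
print(f"mu/L = {ratio:.6e}, |q|/2m = {expected:.6e}")
```

The ratio is independent of the speed and radius chosen, which is the point made in the text: for this model the proportionality constant depends only on the charge and mass of the orbiting particle.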


The above arguments are based on classical physics. In quantum mechanics,
we cannot retain the picture of an electron circulating around a fixed orbit.
Nevertheless, it remains true that many atoms have magnetic moments and 
angular momenta, and these two quantities can be related by Equation 2.16 for 
some choice of gyromagnetic ratio. So if an experiment shows that the magnetic 
dipole moment of an atom is quantized, we can infer that the angular momentum 
of the atom is quantized too. We now describe such an experiment. 


2.2.3 The Stern–Gerlach experiment: $\mu_z$ is quantized


The famous experiment carried out by Stern and Gerlach in 1922 was based on 
the fact that a magnetic dipole experiences a force in a non-uniform magnetic 
field. Stern and Gerlach constructed a magnet with specially-shaped pole pieces, 
designed to produce a strongly non-uniform magnetic field along the path of a 
beam of atoms (Figure 2.6). 


In the experiment carried out by Stern and Gerlach, the beam was one of silver 
atoms, created by heating silver to a high temperature in an enclosure (an ‘oven’) 
with a small hole. Silver atoms emerging from the hole were collimated into a fine 
beam which was directed through the magnetic field, the whole experiment being 
carried out in a very high vacuum. After the silver atoms had passed through the 
magnetic field, they were detected by letting them fall onto a glass plate, building 
up a visible deposit of silver in the places where many atoms fell. 
[Figure labels: shaped pole pieces; atomic beam from oven and collimator.]


Figure 2.6 (a) A perspective
view of a Stern–Gerlach magnet.
(b) Cross-section for a
fixed value of $y$ through a
non-uniform magnetic field
between the specially-shaped
pole pieces. The direction of the
magnetic field $\mathbf{B}$ at any point is
the direction of the field line at
that point, and the magnitude of
the magnetic field is greater
where the field lines are closer
together.
Suppose that an atom in the beam with magnetic dipole moment $\boldsymbol{\mu}$ passes through
the point O in Figure 2.6b, travelling in the y-direction, normal to the plane of the
page. Along this path, the atom experiences a steady force in the z-direction given
by

$$F_z = \mu_z \frac{\partial B_z}{\partial z}. \tag{Eqn 2.15}$$

Since $\partial B_z/\partial z < 0$ in Figure 2.6, an atom with a positive value of $\mu_z$ will be
deflected downwards, and an atom with a negative value of $\mu_z$ will be deflected
upwards.


In classical terms, we would expect the silver atoms to have all orientations of
magnetic dipole moment, so the forces on the atoms would be expected to be in
the z-direction, with values spread throughout the range

$$-|\boldsymbol{\mu}| \left|\frac{\partial B_z}{\partial z}\right| \le F_z \le |\boldsymbol{\mu}| \left|\frac{\partial B_z}{\partial z}\right|. \tag{2.17}$$

Atoms with $\boldsymbol{\mu}$ directed fully downwards would be deflected most strongly
upwards, while atoms with $\boldsymbol{\mu}$ directed fully upwards would be deflected most
strongly downwards. Most atoms would end up somewhere in between, randomly
distributed, with many being deflected very little. The glass plate would therefore
be expected to exhibit a continuous smudge, showing a continuous range of
deflections suffered by the silver atoms. Parts (a) and (b) of Figure 2.7 show what
might have been expected in classical physics, without, and then with, the
non-uniform magnetic field.


Figure 2.7 (a) In the absence of a magnetic field, the
atoms pass straight through the apparatus, leaving a spot
the size of the collimated beam. (b) In a non-uniform
magnetic field, atoms are deflected up or down depending
on the value of $\mu_z$. According to classical physics, all
orientations of $\boldsymbol{\mu}$ are possible, so the atoms would leave a
continuous smudge. (c) In quantum physics, only a
discrete set of values of $\mu_z$ is allowed, and the atoms
leave n spots. The case n = 2, appropriate for silver
atoms, is shown.


Something very different is observed in practice. Stern and Gerlach found that 
half of the silver atoms are deflected upwards by some amount, while the other 
half are deflected downwards by the same amount. In an ideal case, this gives two 
clear spots on the glass plate, as shown in Figure 2.7c. 
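
The contrast between the classical smudge and the two observed spots can be
illustrated with a small numerical sketch. It is not a model of the real
apparatus: it assumes, purely for illustration, a unit magnetic moment, a unit
field gradient with $\partial B_z/\partial z < 0$, and isotropically oriented classical moments
(so that $\cos\theta$ is uniformly distributed). The function names are hypothetical.

```python
import random

GRADIENT = -1.0  # assumed dB_z/dz in arbitrary units (negative, as in Figure 2.6)
MU = 1.0         # assumed magnitude of the magnetic moment, arbitrary units

def classical_forces(n_atoms, seed=0):
    """Forces F_z = mu_z * dB_z/dz for isotropically oriented moments."""
    rng = random.Random(seed)
    # For an isotropic distribution, cos(theta) is uniform on [-1, 1].
    return [MU * rng.uniform(-1.0, 1.0) * GRADIENT for _ in range(n_atoms)]

def quantum_forces(n_atoms, seed=0):
    """Forces when mu_z can take only the two values +MU and -MU."""
    rng = random.Random(seed)
    return [rng.choice((MU, -MU)) * GRADIENT for _ in range(n_atoms)]

classical = classical_forces(10_000)
quantum = quantum_forces(10_000)
print("distinct classical forces:", len(set(classical)))  # many values: a smudge
print("distinct quantum forces:", sorted(set(quantum)))   # only two values: two spots
```

Since the deflection of an atom grows with the force acting on it, a continuous
spread of forces corresponds to a smudge, while two discrete forces correspond
to two spots.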


In practice, the beam may be so wide in the x-direction that different atoms 
experience different magnetic fields. Atoms passing through the edge of the field 
will be deflected less than those passing through the middle. The trace formed on 
the glass plate is therefore like the shape of two lips, as indicated in the upper 
portion of the plaque reproduced in Figure 2.8. The small graph to the side of this 
image shows the number of atoms deflected to various values of z above and 
below the straight-through direction (x = 0, z = 0). It shows just two peaks. The 
width of these peaks reflects the fact that the atoms emerging from the oven have 
the usual thermal distribution of speeds, since the deflection depends on the speed 
as well as the deflecting force. More elaborate experiments, arranged so that all of 
the atoms have almost the same speed, give two very narrow peaks. 



Figure 2.8 Plaque celebrating the experiment of Otto Stern (1888-1969) and 
Walther Gerlach (1889-1979). Stern (on the left) was awarded the 1943 Nobel 
prize for physics for this and other pioneering work involving atomic beams. 
What do we conclude? The experiment can only mean that instead of there being
a range of possible values of $\mu_z$, ranging from $-|\boldsymbol{\mu}|$ to $|\boldsymbol{\mu}|$, there are just two.
Because of the intimate connection between magnetic dipole moment and angular
momentum, the experiment implies that there are just two possible values of the
z-component of the atom’s angular momentum. The z-component of angular
momentum is quantized!


This result is radical, even though Bohr had earlier suggested the quantization of
angular momentum in his semi-quantum model of the hydrogen atom. Consider
an analogy: if we know that a particle has speed v and there is nothing to favour
one direction for the velocity $\mathbf{v}$ over another, then surely $v_z$ can take any value
between $-|\mathbf{v}|$ and $|\mathbf{v}|$; more specifically, $v_z = |\mathbf{v}| \cos\theta$, where $\theta$ is the angle
between $\mathbf{v}$ and the z-axis, ranging continuously from 0 to $\pi$. The Stern–Gerlach
experiment showed that $\mu_z$ and $L_z$ do not have continuous values like this.


Why should the z-components be special? There is absolutely nothing in the 
oven, where the silver atoms came from, to favour one orientation of axes over 
another. With the oven in a fixed orientation, and the whole magnet turned 90° 
about the axis defined by the beam, we would still find two spots. We therefore
conclude that the x-component and the y-component of angular momentum are
quantized as well.


The number of allowed values of angular momentum depends on the type of atom 
considered. In the case of silver atoms, there are just two allowed values of each 
angular momentum component. That is why Stern and Gerlach observed silver 
atoms appearing in two regions of their glass-plate detector. We shall return to this 
point later, but the important conclusion for present purposes is that angular 
momentum is quantized. We shall now develop the quantum-mechanical theory of 
angular momentum, with the aim of explaining this fact. 


2.3 Angular momentum in quantum mechanics 


Experiment has told us that angular momentum is quantized, so we now take up 
the challenge of seeing how this comes about. The first step follows much the 
same route as that taken earlier in the course, leading to the quantization of energy. 


2.3.1 Angular momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$


In quantum mechanics, an observable such as $L_z$ is represented by a linear
operator which we know, from Chapter 1, must be Hermitian. Our first task in 
developing a quantum theory of angular momentum is to find suitable operators 
for the components of angular momentum. 


The method of obtaining these operators is very similar to that used to write down
the Hamiltonian operator. First we write down an appropriate classical expression,
then we replace variables by their corresponding operators. For momentum
components, we use the standard replacements

$$p_x \Rightarrow \hat{p}_x = -i\hbar \frac{\partial}{\partial x}, \qquad
p_y \Rightarrow \hat{p}_y = -i\hbar \frac{\partial}{\partial y}, \qquad
p_z \Rightarrow \hat{p}_z = -i\hbar \frac{\partial}{\partial z}. \tag{2.18}$$

For position components, we use the rule that the operator $\hat{x}$ is simply the act of
multiplying by the variable x, with similar rules for $\hat{y}$ and $\hat{z}$. Applying these rules
to the z-component of angular momentum, we obtain


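$$L_z = x p_y - y p_x \quad \Rightarrow \quad \hat{L}_z = -i\hbar \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right), \tag{2.19}$$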


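with similar expressions for $\hat{L}_x$ and $\hat{L}_y$.

● Write down expressions for the operators $\hat{L}_x$ and $\hat{L}_y$.

○ From Equations 2.5 and 2.6, we have
$L_x = y p_z - z p_y$ and $L_y = z p_x - x p_z$.
Substituting $\hat{p}_z = -i\hbar\, \partial/\partial z$ and the corresponding expressions for $\hat{p}_x$ and $\hat{p}_y$,
we obtain

$$\hat{L}_x = -i\hbar \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \tag{2.20}$$

and

$$\hat{L}_y = -i\hbar \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right). \tag{2.21}$$

You can verify that all of the expressions for $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$
can be obtained from any one of them, say Equation 2.19, by
cycling the indices wherever they appear: $x \Rightarrow y \Rightarrow z \Rightarrow x$.
These equations are the starting point for the quantum theory
of angular momentum.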


2.3.2 Spherical coordinates 


For many purposes, it is better to use the spherical coordinates
r, $\theta$ and $\phi$ shown in Figure 2.9, rather than the more familiar
Cartesian coordinates x, y and z. The relationship between
these two sets of coordinates is given by

$$x = r \sin\theta \cos\phi, \tag{2.22}$$
$$y = r \sin\theta \sin\phi, \tag{2.23}$$
$$z = r \cos\theta. \tag{2.24}$$

Figure 2.9 Spherical coordinates: r is called the radial coordinate, $\theta$ the
polar angle, and $\phi$ the azimuthal angle. The angle $\theta$ ranges from 0 to $\pi$ and
the angle $\phi$ ranges from 0 to $2\pi$.

There are many occasions where the use of spherical coordinates simplifies the
description. For example, the following expressions both describe the potential
energy of a point charge q due to another point charge Q at the origin:

$$V(r) = \frac{1}{4\pi\varepsilon_0} \frac{qQ}{r},$$

$$V(x, y, z) = \frac{1}{4\pi\varepsilon_0} \frac{qQ}{\sqrt{x^2 + y^2 + z^2}}.$$


The first of these expressions is in spherical coordinates, while the second is 

in Cartesian coordinates; there is no doubt which looks simpler! Spherical 
coordinates are useful for systems such as atoms that have spherical symmetry. 
Angular momentum is also important in situations with spherical symmetry, so it 
will be useful to express the angular momentum operators in terms of spherical 
coordinates. 


The task of deriving expressions for $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ in spherical coordinates is a
lengthy and non-trivial mathematical exercise, so we shall move directly to
the most important conclusion. Starting from Equation 2.19, and using
Equations 2.22–2.24, it is possible to show that

$$\hat{L}_z = -i\hbar \frac{\partial}{\partial \phi}. \tag{2.25}$$

This is an expression you should memorize. The expressions for $\hat{L}_x$ and $\hat{L}_y$ in
spherical coordinates are much more complicated, but they are not needed in this
course, so we will not write them down.


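Optional check: If you are familiar with the chain rule of partial
differentiation, it is not difficult to check that Equation 2.25 is correct. The chain
rule tells us that, for any function $f(r, \theta, \phi)$,

$$\frac{\partial f}{\partial \phi} = \frac{\partial x}{\partial \phi} \frac{\partial f}{\partial x} + \frac{\partial y}{\partial \phi} \frac{\partial f}{\partial y} + \frac{\partial z}{\partial \phi} \frac{\partial f}{\partial z}.$$

Using Equations 2.22–2.24 to calculate the three partial derivatives $\partial x/\partial \phi$,
$\partial y/\partial \phi$ and $\partial z/\partial \phi$, we therefore see that

$$\frac{\partial f}{\partial \phi} = (-r \sin\theta \sin\phi) \frac{\partial f}{\partial x} + (r \sin\theta \cos\phi) \frac{\partial f}{\partial y} = -y \frac{\partial f}{\partial x} + x \frac{\partial f}{\partial y},$$

so Equation 2.25 is consistent with Equation 2.19.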
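
The same consistency check can be made numerically. The sketch below drops the
common factor $-i\hbar$ and compares $\partial f/\partial\phi$ with $x\,\partial f/\partial y - y\,\partial f/\partial x$ at one point,
using central differences. The test function, the sample point and the step
size are arbitrary choices made for illustration, not part of the text above.

```python
import math

def f_cart(x, y, z):
    """An arbitrary smooth test function (a hypothetical choice)."""
    return x * x * y + math.sin(z) * x

def f_sph(r, theta, phi):
    """The same function expressed through spherical coordinates."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return f_cart(x, y, z)

def d_dphi(r, theta, phi, h=1e-6):
    """Central-difference estimate of df/dphi at fixed r and theta."""
    return (f_sph(r, theta, phi + h) - f_sph(r, theta, phi - h)) / (2 * h)

def x_dy_minus_y_dx(x, y, z, h=1e-6):
    """Central-difference estimate of x df/dy - y df/dx."""
    df_dy = (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2 * h)
    df_dx = (f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2 * h)
    return x * df_dy - y * df_dx

r, theta, phi = 1.3, 0.7, 2.1          # arbitrary sample point
x = r * math.sin(theta) * math.cos(phi)
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)
lhs = d_dphi(r, theta, phi)
rhs = x_dy_minus_y_dx(x, y, z)
print(abs(lhs - rhs) < 1e-6)           # the two forms agree to numerical accuracy
```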


2.3.3 Quantization of $L_z$


In general, the allowed values of an observable O are the eigenvalues of the
corresponding quantum-mechanical operator $\hat{O}$. We wish to find the allowed
values of the angular momentum component $L_z$, and we can do this by finding the
eigenvalues of the operator $\hat{L}_z$.


Expressed in spherical coordinates, the eigenvalue equation for $\hat{L}_z$ takes the form

$$-i\hbar \frac{\partial}{\partial \phi} \psi(r, \theta, \phi) = \alpha\, \psi(r, \theta, \phi), \tag{2.26}$$
where $\alpha$ is a constant. The operator $-i\hbar\, \partial/\partial \phi$ does not affect the r and $\theta$
coordinates, so we can assume that the eigenfunctions are of the form

$$\psi(r, \theta, \phi) = g(r, \theta)\, f(\phi). \tag{2.27}$$

Substituting into Equation 2.26 and cancelling through by $g(r, \theta)$, we then obtain
the differential equation

$$-i\hbar \frac{df}{d\phi} = \alpha f(\phi), \tag{2.28}$$

(An ordinary derivative is used because $f(\phi)$ is a function of a single variable.)
This equation has the general solution

$$f(\phi) = A e^{i\alpha\phi/\hbar} = A e^{im\phi}, \tag{2.29}$$

where A is a constant and $m = \alpha/\hbar$.

● Verify that the function $A e^{im\phi}$ is an eigenfunction of $\hat{L}_z$ and determine its
eigenvalue.

○ Substituting $f(\phi) = A e^{im\phi}$ into Equation 2.28, we find

$$-i\hbar \frac{d}{d\phi}\left( A e^{im\phi} \right) = -i\hbar \left( i m A e^{im\phi} \right) = m\hbar \left( A e^{im\phi} \right), \tag{2.30}$$

so $f(\phi)$ is an eigenfunction of $\hat{L}_z$ with eigenvalue $m\hbar$.


Eigenfunctions are often required to satisfy subsidiary conditions; for example,
the energy eigenfunctions that solve the time-independent Schrödinger
equation are required to be continuous and finite. In the present case, we
impose the condition that the eigenfunctions that satisfy Equation 2.28 must
‘join up with themselves’ as $\phi$ goes through 360 degrees ($2\pi$ radians), so that
$f(\phi + 2\pi) = f(\phi)$. We refer to this as the single-valuedness condition. Using
Equation 2.29, we require that

$$A e^{im(\phi + 2\pi)} = A e^{im\phi},$$

which gives

$$e^{2\pi i m} = 1.$$

This last equation can be satisfied only if m is an integer (positive, negative or
zero):

$$m = 0, \pm 1, \pm 2, \pm 3, \ldots. \tag{2.31}$$

The integer m is a quantum number for $L_z$. The eigenvalues of $\hat{L}_z$, and hence
the allowed values of $L_z$, are given by $m\hbar$. We therefore conclude that the
z-component of angular momentum is quantized, coming in $\hbar$-sized lumps.
Because of the link between the angular momentum and the magnetic dipole
moment, m is called the magnetic quantum number. (Some authors call m the
azimuthal quantum number.)
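
The single-valuedness condition is easy to test numerically. The following
sketch checks whether $e^{im\phi}$ returns to its starting value when $\phi$ increases by
$2\pi$, for a few trial values of m. The helper name, the sample angle and the
tolerance are illustrative assumptions.

```python
import cmath
import math

def is_single_valued(m, phi=0.4, tol=1e-9):
    """True if exp(i m phi) returns to its starting value after phi -> phi + 2 pi."""
    start = cmath.exp(1j * m * phi)
    end = cmath.exp(1j * m * (phi + 2 * math.pi))
    return abs(end - start) < tol

# Integer values of m pass the test; non-integer values such as 0.5 or 1.3 fail.
for m in (-2, -1, 0, 1, 2, 0.5, 1.3):
    print(m, is_single_valued(m))
```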


Normalization and orthogonality of the eigenfunctions 


It is conventional to choose the constant A in Equation 2.29 to be equal to $1/\sqrt{2\pi}$.
An eigenfunction of $\hat{L}_z$, with quantum number m, can then be written as

$$f_m(\phi) = \frac{1}{\sqrt{2\pi}}\, e^{im\phi}. \tag{2.32}$$

This choice of A ensures that

$$\int_0^{2\pi} |f_m(\phi)|^2 \, d\phi = \frac{1}{2\pi} \int_0^{2\pi} \left| e^{im\phi} \right|^2 d\phi = \frac{1}{2\pi} \int_0^{2\pi} 1 \, d\phi = 1, \tag{2.33}$$

and we say that the eigenfunction is normalized.


In a similar vein, we can show that two different eigenfunctions, $f_m(\phi)$ and
$f_n(\phi)$, obey

$$\int_0^{2\pi} f_m^*(\phi)\, f_n(\phi) \, d\phi = 0 \quad \text{for } m \neq n, \tag{2.34}$$

and we describe this by saying that different eigenfunctions of $\hat{L}_z$ are orthogonal.


Exercise 2.3 Verify Equation 2.34. ■
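
A numerical check is no substitute for the analytic verification asked for in
the exercise, but it can be reassuring. The sketch below approximates the
integrals in Equations 2.33 and 2.34 by a midpoint sum; the function names and
the number of steps are arbitrary choices.

```python
import cmath
import math

def f(m, phi):
    """Normalized eigenfunction exp(i m phi) / sqrt(2 pi)."""
    return cmath.exp(1j * m * phi) / math.sqrt(2 * math.pi)

def overlap(m, n, steps=2000):
    """Midpoint-sum estimate of the integral of f_m* f_n over 0 to 2 pi."""
    dphi = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        phi = (k + 0.5) * dphi
        total += f(m, phi).conjugate() * f(n, phi) * dphi
    return total

print(abs(overlap(2, 2) - 1) < 1e-9)   # normalization: the integral is 1
print(abs(overlap(2, -1)) < 1e-9)      # orthogonality: the integral vanishes
```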


2.3.4 Quantization of $L_x$, $L_y$ and $L^2$


So far, we have considered the z-component of angular momentum, but
what about the other two components? Our reason for concentrating on the
z-component is that the operator $\hat{L}_z$ has a simple form in spherical coordinates.
The same cannot be said for $\hat{L}_x$ and $\hat{L}_y$, but their more complicated expressions
are not needed here so we shall not write them down.


However, we are free to choose any direction for the z-axis and, no matter
what direction is chosen, we will always find that the z-component of angular
momentum is equal to $m\hbar$, where m is an integer. This means that the angular
momentum component, taken along any direction in space, can only have the
values $0, \pm\hbar, \pm 2\hbar, \ldots$. It follows without further calculation that the possible
values of $L_x$ and $L_y$ are $0, \pm\hbar, \pm 2\hbar, \ldots$. All three operators, $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$, have
the same set of eigenvalues.


There is one very important point about these eigenvalues that must be
understood. It is not possible to find a state in which, for example, $L_x = 2\hbar$,
$L_y = -\hbar$ and $L_z = 3\hbar$. This is because any state that has a definite non-zero
value of $L_z$ does not have definite values of $L_x$ and $L_y$. You will see the reasons
for this later, but you can think of it as being analogous to the fact that a particle
cannot simultaneously have definite values of both position and momentum.


We must also consider the magnitude of the angular momentum, L, or its
square, $L^2$. For example, the classical expression for the rotational energy of a
rigid body, rotating about a fixed axis, is $L^2/2I$, where I is the moment of inertia
of the body. We therefore need to discuss the quantization of $L^2$.


The quantum-mechanical operator corresponding to $L^2$ is

$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2.$$

With some effort, we could use standard results for $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ to express this
as an explicit differential operator. The result would be quite messy, but we could
then, in principle, write down an eigenvalue equation for $\hat{L}^2$ and solve it to find
the eigenfunctions and eigenvalues; the eigenvalues are the possible values of $L^2$.
Here, we simply state the result that emerges for these eigenvalues.


The eigenvalues of $\hat{L}^2$

The eigenvalues of $\hat{L}^2$ are given by the formula $l(l+1)\hbar^2$, where l is any
non-negative integer, including zero: $l = 0, 1, 2, 3, \ldots$. So the first few
allowed values of $L^2$ are $0, 2\hbar^2, 6\hbar^2, 12\hbar^2, \ldots$. The integer l is often called
the orbital angular momentum quantum number.


2.3.5 The spectra of rotating molecules 


When infrared radiation with a wide and continuous range of frequencies passes
through pure hydrogen chloride (HCl) gas, the radiation is absorbed at a series of
specific frequencies. The top part of Figure 2.10 shows a number of dips in the
intensity of radiation transmitted through HCl gas; these dips occur at frequencies
at which the radiation is strongly absorbed.


Figure 2.10 An absorption spectrum for the transmission of infrared radiation
through HCl gas (top; horizontal axis: wavelength / μm). The dips correspond
to the frequencies at which the radiation is absorbed. The bottom part of the
figure shows energy levels and transitions that account for the measured
spectrum (vertical axis: ΔE / eV; levels marked at 0 eV, 0.00256 eV, 0.0077 eV,
0.0154 eV, 0.0256 eV and 0.0385 eV).


The lower part of Figure 2.10 gives an analysis that explains where the pattern of 
absorbed frequencies comes from. Each absorption frequency f arises from a 
transition between two energy levels separated by an energy hf. The energy 
levels marked in the figure account for the dips in the measured absorption 
spectrum. In the following exercise, you can check that these energy levels are
proportional to $l(l+1)$, where $l = 0, 1, 2, \ldots$.


Exercise 2.4 The lower part of Figure 2.10 shows an excited state at an energy
of 0.00256 eV above the ground state. Calculate from this the next two energy
levels, assuming that the energies are proportional to $l(l+1)$, where the ground
state has l = 0, the 0.00256 eV state has l = 1, etc. Compare your answers with
the energies given in the figure, and comment on this. ■


It is not difficult to understand where the $l(l+1)$ rule comes from. We begin with
a classical model. The HCl molecule, consisting of one hydrogen and one
chlorine atom, can be treated as a rigid body, rather like a lop-sided dumbbell.
The rotational energy of the molecule can then be expressed as

$$E_{\text{rot}} = \frac{L^2}{2I}, \tag{Eqn 2.10}$$

where I is the moment of inertia about any axis that is perpendicular to the line
joining the two atoms and passes through the centre of mass of the molecule.


You might wonder whether this formula (which applies to rotations about a fixed
axis) is valid for a molecule that tumbles through space. A more general formula
for the rotational energy of the molecule is

$$E_{\text{rot}} = \frac{L_x^2}{2I_x} + \frac{L_y^2}{2I_y} + \frac{L_z^2}{2I_z},$$

where the z-axis is chosen to be along the line joining the atoms, and the x- and
y-axes are perpendicular to this line. The moments of inertia $I_x$, $I_y$ and $I_z$ refer to
rotations about these axes. Because of the linear shape of the HCl molecule, we
can safely neglect $L_z$ in comparison to $L_x$ and $L_y$. Also, by symmetry, we have
$I_x = I_y = I$, so we see that

$$E_{\text{rot}} = \frac{L_x^2 + L_y^2}{2I} = \frac{L^2}{2I},$$

as assumed.


Equation 2.10 is an expression for the rotational energy of the molecule expressed
in terms of a momentum (albeit an angular momentum). It is the Hamiltonian
function for the rotational energy of the molecule (there is no potential energy
term in this case). The corresponding Hamiltonian operator $\hat{H}$ is found by
replacing $L^2$ with $\hat{L}^2$, so that the time-independent Schrödinger equation for the
rotating molecule is

$$\hat{H}\psi = \frac{\hat{L}^2}{2I}\psi = E\psi. \tag{2.35}$$

The energy eigenvalues of this equation are the allowed rotational energies of
the molecule. But $\hat{H}$ is just a constant ($1/2I$) times $\hat{L}^2$, and we know that the
eigenvalues of $\hat{L}^2$ are $l(l+1)\hbar^2$ for $l = 0, 1, 2, \ldots$. Hence we conclude that the
rotational energy levels of the molecule are

$$E = \frac{l(l+1)\hbar^2}{2I} \quad \text{for } l = 0, 1, 2, \ldots,$$

which agrees with our analysis of the spectrum in Figure 2.10.


Perhaps you think that such a simple model for an HCl molecule is too good to be
true. This molecule is, after all, a complicated system containing 18 electrons and
two nuclei. Well, you would be right: there are many other states corresponding
to the vibration of the molecule, and different arrangements of the electrons, but
these other states are at much higher energies. The low-energy states of HCl are
quite well described by our rotational model. Moreover, we can learn a basic fact
about the HCl molecule from the spacing between the levels, which depends upon
the moment of inertia, I. Since the distance between the hydrogen and chlorine
atoms determines I, this distance can be deduced from the measured spacing of
the energy levels. It turns out to be 0.13 nm.
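
The estimate of the separation can be reproduced in a few lines. The sketch
below assumes that the molecule behaves as two point masses, so that the moment
of inertia is the reduced mass times the square of the separation, and it uses
approximate atomic masses for hydrogen and chlorine-35. These modelling choices
and mass values are assumptions introduced here; only the 0.00256 eV level is
taken from Figure 2.10.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant in J s
EV = 1.602176634e-19      # joules per electronvolt
AMU = 1.66053906660e-27   # kilograms per atomic mass unit

E1_EV = 0.00256           # energy of the l = 1 level, read from Figure 2.10
M_H, M_CL = 1.008, 34.97  # assumed atomic masses in u (hydrogen, chlorine-35)

# E_l = l(l+1) hbar^2 / (2I), so the l = 1 level lies at hbar^2 / I.
inertia = HBAR**2 / (E1_EV * EV)

# Assumption: two point masses, so I = (reduced mass) * (separation)^2.
reduced_mass = M_H * M_CL / (M_H + M_CL) * AMU
separation_nm = math.sqrt(inertia / reduced_mass) * 1e9

print(round(separation_nm, 2))  # 0.13
```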


2.4 The conservation of angular momentum 


Now that we have operators that describe angular momentum, we can consider the
quantum-mechanical version of the law of conservation of angular momentum.
Our starting point is the generalized Ehrenfest theorem derived in the preceding
chapter. This tells us the time-development of expectation values; if we have an
observable A in a given system, the rate of change of its expectation value is

$$\frac{d\langle A \rangle}{dt} = \frac{1}{i\hbar} \left\langle [\hat{A}, \hat{H}] \right\rangle, \tag{Eqn 1.46}$$

where $\hat{H}$ is the Hamiltonian operator for the system under consideration. We will
now apply this general result to the components of angular momentum.


Because our choice of axes is arbitrary, it does not matter, physically, which
component of angular momentum we choose to consider. As usual, we shall
concentrate on the z-component, for which

$$\frac{d\langle L_z \rangle}{dt} = \frac{1}{i\hbar} \left\langle [\hat{L}_z, \hat{H}] \right\rangle. \tag{2.36}$$

We immediately see that the expectation value of $L_z$ will remain constant in time
provided that $[\hat{L}_z, \hat{H}] = 0$. We must therefore investigate whether $\hat{L}_z$ commutes
with the Hamiltonian operator $\hat{H}$.


The simplest case to consider is that of a free particle of mass m, for which the
Hamiltonian operator has the form

$$\hat{H} = \frac{1}{2m} \left( \hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2 \right).$$

Noting that $\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x$, we have

$$[\hat{L}_z, \hat{H}] = \frac{1}{2m} \left[ \hat{x}\hat{p}_y - \hat{y}\hat{p}_x,\; \hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2 \right].$$

This looks a lot worse than it is! Most of the operators commute with one another:
the momentum operators all commute with one another, and $\hat{p}_x$ commutes
with $\hat{y}$, for example. The only pairs of operators that do not commute with one
another are $\hat{x}$ and $\hat{p}_x$, and $\hat{y}$ and $\hat{p}_y$, and these are the only pairs of operators
for which order matters. Bearing these points in mind, the above expression
simplifies to

$$[\hat{L}_z, \hat{H}] = \frac{1}{2m} \left( [\hat{x}, \hat{p}_x^2]\, \hat{p}_y - [\hat{y}, \hat{p}_y^2]\, \hat{p}_x \right).$$
The commutator $[\hat{x}, \hat{p}_x^2]$ may look familiar: it was evaluated in Worked
Example 1.2 of Chapter 1, where we found that

$$[\hat{x}, \hat{p}_x^2] = 2i\hbar\, \hat{p}_x.$$

Obviously, a similar result applies to $[\hat{y}, \hat{p}_y^2]$, which is equal to $2i\hbar\, \hat{p}_y$, so we
conclude that

$$[\hat{L}_z, \hat{H}] = \frac{1}{2m} \left( 2i\hbar\, \hat{p}_x \hat{p}_y - 2i\hbar\, \hat{p}_y \hat{p}_x \right) = 0.$$

The z-component of angular momentum commutes with the Hamiltonian
operator, so the expectation value of $L_z$ is conserved for a free particle. This is
perhaps not a surprise, but we have given a solid proof of this fact using the
principles of quantum mechanics.
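
The commutator of position with the squared momentum, which carries the whole
argument, can be checked on a concrete function. The sketch below represents a
polynomial in x by its list of coefficients and applies the operators exactly.
The representation, the helper names, the test polynomial and the choice of
units with $\hbar = 1$ are all assumptions made for illustration.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption made for convenience)

def p(coeffs):
    """Apply p = -i hbar d/dx to a polynomial given as [c0, c1, c2, ...]."""
    return [-1j * HBAR * k * c for k, c in enumerate(coeffs)][1:] or [0j]

def x(coeffs):
    """Multiply a polynomial by x."""
    return [0j] + list(coeffs)

def sub(a, b):
    """Return the polynomial a - b, padding the shorter list with zeros."""
    n = max(len(a), len(b))
    a = list(a) + [0j] * (n - len(a))
    b = list(b) + [0j] * (n - len(b))
    return [u - v for u, v in zip(a, b)]

f = [1, 2, 0, 5]                              # f(x) = 1 + 2x + 5x^3, an arbitrary test function
commutator_f = sub(x(p(p(f))), p(p(x(f))))    # [x, p^2] acting on f
expected = [2j * HBAR * c for c in p(f)]      # 2 i hbar p acting on f

print(all(abs(u - v) < 1e-12 for u, v in zip(commutator_f, expected)))
```

The same check with y in place of x gives the second commutator, so the two
terms in the final expression cancel as stated.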


Now let us consider a system analogous to a planet orbiting the Sun. In classical 
physics, the planet experiences a force from the Sun, but this force is central, 
acting along the line joining the planet to the Sun. Under these circumstances, 
classical physics predicts that the angular momentum of the planet is conserved, 
and Kepler’s second law is an observed consequence of this conservation law. 


In quantum mechanics, we generally deal with potential energy functions rather 
than forces. The statement that a force is central about a given point O is 
equivalent to saying that the potential energy function is spherically symmetric 
about O. So if we choose O as our origin and use spherical coordinates, the 
potential energy function takes the form V (r), which depends only on the radial 
coordinate, and does not depend on the angular coordinates $\theta$ and $\phi$.


If we consider a system with the Hamiltonian operator

$$\hat{H} = \frac{1}{2m} \left( \hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2 \right) + V(r), \tag{2.37}$$

we can ask whether $\hat{L}_z$ and $\hat{H}$ are commuting operators. We already know that $\hat{L}_z$
commutes with the kinetic energy term. To see whether $\hat{L}_z$ also commutes
with the potential energy term, it is best to use the spherical coordinate form
$\hat{L}_z = -i\hbar\, \partial/\partial \phi$. Then, for any function $f(r, \theta, \phi)$, we have

$$\hat{L}_z V(r) f(r, \theta, \phi) = -i\hbar \frac{\partial}{\partial \phi} \bigl( V(r) f(r, \theta, \phi) \bigr)
= -i\hbar\, V(r) \frac{\partial}{\partial \phi} f(r, \theta, \phi)
= V(r)\, \hat{L}_z f(r, \theta, \phi).$$

Since this is true for any function $f(r, \theta, \phi)$, we conclude that $\hat{L}_z$ commutes with
the potential energy term, as well as the kinetic energy term, so it commutes with
the whole Hamiltonian operator. It follows that the expectation value of $L_z$ is
conserved.


We have concentrated on $L_z$ for mathematical reasons, taking advantage of
the fact that it is described by a nice simple operator in spherical coordinates.
However, the Hamiltonian operator in Equation 2.37 is spherically symmetric, in
the sense that it does not depend on how we orient our axes. Since we can choose
the z-axis to point in any direction in space, this means that the expectation value
of the angular momentum component in any chosen direction is conserved. This
then implies that the expectation values of $L_x$ and $L_y$ are conserved as well as the
expectation value of $L_z$.


We see here a particular example of the link between symmetry and conservation 
laws described in general terms in Chapter 1. Symmetries give rise to 
conservation laws. In the present context, the system is spherically symmetric and 
this symmetry gives rise to the conservation of angular momentum, which is 
expressed in quantum mechanics by the fact that the expectation value of any 
component of angular momentum remains constant in time. 


We can also meet situations in which the potential energy function depends on 
angle. 


Exercise 2.5 Does the operator $\hat{L}_z$ commute with the potential energy function
in the following cases: (a) a potential energy function $V(r, \theta)$ that depends
on the spherical coordinates r and $\theta$, but is independent of $\phi$; (b) a potential
energy function $V(r, \phi)$ that depends on the spherical coordinates r and $\phi$, but is
independent of $\theta$?


In each case discuss whether a particle subject to the given potential energy
function will have a constant expectation value of $L_z$. ■


2.5 Compatible and incompatible observables 


Section 2.3.4 raised an important issue, which we shall now discuss in detail. We
said that it is impossible to find a state in which $L_z$ has a definite non-zero
value, and $L_x$ and $L_y$ also have definite values. More generally, if any of these
observables has a definite non-zero value, the other two will have uncertain
values. This means that we cannot label an arbitrary state by giving simultaneous
values for $L_x$, $L_y$ and $L_z$: these three observables are said to be incompatible.


2.5.1 Commutation relations and compatibility 


One way to understand the incompatibility of $L_x$, $L_y$ and $L_z$ is to use the
generalized uncertainty principle discussed in the previous chapter. This tells us
that the product of the uncertainties of two observables A and B obeys

$$\Delta A \, \Delta B \ge \tfrac{1}{2} \left| \left\langle [\hat{A}, \hat{B}] \right\rangle \right|, \tag{2.38}$$

where the uncertainties on the left and the expectation value on the right are
evaluated in the same state. It turns out to be very difficult (often impossible) to
satisfy this inequality if $\Delta A = 0$, $\Delta B = 0$ and $[\hat{A}, \hat{B}] \neq 0$. (In fact, the state
involved would have to be an eigenvector of the operator $[\hat{A}, \hat{B}]$, with zero
eigenvalue; such states are truly exceptional.) The physical
interpretation is that observables A and B cannot both have definite values in the
same state (ignoring rare exceptions) if the operators $\hat{A}$ and $\hat{B}$ do not commute
with one another.


To apply these ideas to angular momentum, we must evaluate commutators of 
different angular momentum operators. The following worked example illustrates 
how this is done. 


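Worked Example 2.1 (Essential skill: Evaluating a commutator)

Evaluate the commutator $[\hat{L}_x, \hat{L}_y]$.

Solution

It is convenient here to work in Cartesian coordinates. First let $\hat{L}_x \hat{L}_y$ act on
an arbitrary function $f(x, y, z)$, to obtain

$$\hat{L}_x \hat{L}_y f = -\hbar^2 \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) f$$

$$= -\hbar^2 \left[ y \frac{\partial}{\partial z}\left( z \frac{\partial f}{\partial x} \right) - y \frac{\partial}{\partial z}\left( x \frac{\partial f}{\partial z} \right) - z \frac{\partial}{\partial y}\left( z \frac{\partial f}{\partial x} \right) + z \frac{\partial}{\partial y}\left( x \frac{\partial f}{\partial z} \right) \right]$$

$$= -\hbar^2 \left[ y \frac{\partial f}{\partial x} + \text{four terms involving second-order partial derivatives} \right].$$

Now, if we operate on $f(x, y, z)$ with $\hat{L}_y \hat{L}_x$, we obtain

$$\hat{L}_y \hat{L}_x f = -\hbar^2 \left( z \frac{\partial}{\partial x} - x \frac{\partial}{\partial z} \right) \left( y \frac{\partial}{\partial z} - z \frac{\partial}{\partial y} \right) f$$

$$= -\hbar^2 \left[ z \frac{\partial}{\partial x}\left( y \frac{\partial f}{\partial z} \right) - z \frac{\partial}{\partial x}\left( z \frac{\partial f}{\partial y} \right) - x \frac{\partial}{\partial z}\left( y \frac{\partial f}{\partial z} \right) + x \frac{\partial}{\partial z}\left( z \frac{\partial f}{\partial y} \right) \right]$$

$$= -\hbar^2 \left[ x \frac{\partial f}{\partial y} + \text{four terms involving second-order partial derivatives} \right].$$

It turns out that the terms involving second-order partial derivatives are the
same in both $\hat{L}_x \hat{L}_y f$ and $\hat{L}_y \hat{L}_x f$, so we are left with

$$\left( \hat{L}_x \hat{L}_y - \hat{L}_y \hat{L}_x \right) f = -\hbar^2 \left( y \frac{\partial}{\partial x} - x \frac{\partial}{\partial y} \right) f
= i\hbar \times (-i\hbar) \left( x \frac{\partial}{\partial y} - y \frac{\partial}{\partial x} \right) f = i\hbar\, \hat{L}_z f.$$

Since this equation applies for any function $f(x, y, z)$, we can write it as an
identity between operators:

$$[\hat{L}_x, \hat{L}_y] = i\hbar\, \hat{L}_z.$$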
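
The result of this worked example can be confirmed on a specific function
without tracking the second-order terms by hand. The sketch below stores a
polynomial in x, y and z as a dictionary from exponent triples to coefficients
and applies the Cartesian operators of Equations 2.19 to 2.21 exactly. The data
structure, the helper names, the test polynomial and the units with $\hbar = 1$ are
illustrative assumptions.

```python
HBAR = 1.0  # units with hbar = 1 (an assumption made for convenience)
X, Y, Z = 0, 1, 2

def deriv(poly, var):
    """Partial derivative of {(i, j, k): coeff} with respect to coordinate var."""
    out = {}
    for powers, c in poly.items():
        if powers[var] > 0:
            new = list(powers)
            new[var] -= 1
            out[tuple(new)] = out.get(tuple(new), 0) + c * powers[var]
    return out

def times(poly, var):
    """Multiply a polynomial by the coordinate with index var."""
    out = {}
    for powers, c in poly.items():
        new = list(powers)
        new[var] += 1
        out[tuple(new)] = c
    return out

def combine(a, b, sign=1):
    """Return a + sign * b, dropping terms whose coefficient is zero."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + sign * c
    return {k: c for k, c in out.items() if abs(c) > 1e-12}

def scale(poly, factor):
    return {k: factor * c for k, c in poly.items()}

def L(poly, u, v):
    """-i hbar (u d/dv - v d/du): (Y, Z) gives L_x, (Z, X) gives L_y, (X, Y) gives L_z."""
    return scale(combine(times(deriv(poly, v), u), times(deriv(poly, u), v), -1), -1j * HBAR)

f = {(2, 1, 0): 3, (1, 0, 2): -1, (0, 1, 1): 2}   # 3x^2 y - x z^2 + 2yz, arbitrary
lhs = combine(L(L(f, Z, X), Y, Z), L(L(f, Y, Z), Z, X), -1)   # (L_x L_y - L_y L_x) f
rhs = scale(L(f, X, Y), 1j * HBAR)                             # i hbar L_z f

print(combine(lhs, rhs, -1) == {})   # True when the two sides agree term by term
```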


You saw earlier that a cyclic permutation $x \Rightarrow y \Rightarrow z \Rightarrow x$ converts an
equation for one angular momentum component into an equally valid equation for
another angular momentum component. The same transformation will convert the
commutation relation for $[\hat{L}_x, \hat{L}_y]$ into valid commutation relations for other
angular momentum components. The complete set of commutation relations is

$$[\hat{L}_x, \hat{L}_y] = i\hbar\, \hat{L}_z, \tag{2.39}$$
$$[\hat{L}_y, \hat{L}_z] = i\hbar\, \hat{L}_x, \tag{2.40}$$
$$[\hat{L}_z, \hat{L}_x] = i\hbar\, \hat{L}_y. \tag{2.41}$$

You need remember only one of these commutation relations, obtaining the others
by cyclically permuting the subscripts.


It is very significant that different components of angular momentum do not
commute with one another.
Using these commutation relations in the generalized uncertainty principle
(Equation 2.38), we see that

$$\Delta L_x \, \Delta L_y \ge \frac{\hbar}{2} \left| \langle L_z \rangle \right|, \qquad
\Delta L_y \, \Delta L_z \ge \frac{\hbar}{2} \left| \langle L_x \rangle \right|, \qquad
\Delta L_z \, \Delta L_x \ge \frac{\hbar}{2} \left| \langle L_y \rangle \right|.$$

These inequalities prevent any pair of angular momentum components from
having definite non-zero values in the same state. For, let us suppose that $L_x$ has
the definite value $m_x\hbar$ and, in the same state, $L_y$ has the definite value $m_y\hbar$.
Since $L_x$ and $L_y$ have definite values, we have $\Delta L_x = \Delta L_y = 0$. The above
inequalities then tell us that all three expectation values, $\langle L_x \rangle$, $\langle L_y \rangle$ and $\langle L_z \rangle$,
must be zero, so $m_x = m_y = 0$. Hence, it is impossible for $L_x$ and $L_y$ to have
definite values in the same state, unless both of these values are equal to zero. For
example, the ground state of a hydrogen atom happens to be a state in which all
three components of the angular momentum vanish. Apart from this notable
exception, no other state of the atom can be labelled by the values of more than
one component of angular momentum.


We can also consider the compatibility of a component of angular momentum
(say $L_z$) and $L^2$. The key issue is whether the operators $\hat{L}_z$ and $\hat{L}^2$ commute with
one another. It is possible to show (we do not give the proof here) that

$$[\hat{L}^2, \hat{L}_x] = [\hat{L}^2, \hat{L}_y] = [\hat{L}^2, \hat{L}_z] = 0. \tag{2.42}$$

In other words, $\hat{L}^2$ commutes with all three components of angular momentum.
This means that the generalized uncertainty principle places no bar on $L^2$ and,
say, $L_z$ having definite values in the same state.


Strictly speaking, the fact that the generalized uncertainty principle raises no
objections does not show that it is possible to find states in which $L^2$ and $L_z$ both
have definite values. However, there is a separate theorem (again not proved here)
that if a set of Hermitian operators $\hat{A}$, $\hat{B}$, ... all commute with one another, then
their eigenfunctions can always be chosen to be simultaneous eigenfunctions of
all the mutually commuting operators. Given this fact, we can be confident that
states can be found in which both $L^2$ and $L_z$ have definite values. $L^2$ and $L_z$ are
said to be compatible observables.


2.5.2 Simultaneous eigenfunctions of $\hat{L}^2$ and $\hat{L}_z$


We have just seen that the operators $\hat{L}^2$ and $\hat{L}_z$ commute with one another, and
that this means that these operators have a set of simultaneous eigenfunctions,
representing states in which both $L^2$ and $L_z$ have definite values. We shall now
describe these states in more detail.


In spherical coordinates, the operator $\hat{L}^2$ is a complex expression involving $\theta$
and $\phi$. Earlier on, we did not bother to write this down, but asked you to imagine
the process of forming the appropriate eigenvalue equation and finding its
eigenvalues and eigenfunctions. The eigenvalues are of the form $l(l+1)\hbar^2$, where
$l = 0, 1, 2, \ldots$. The eigenfunctions are functions of $\theta$ and $\phi$, such as $\cos\theta\, e^{i\phi}$, but
again we shall avoid the details in this book.


Fortunately, Dirac notation allows us to specify particular eigenstates of $\hat{L}^2$
and $\hat{L}_z$ without writing down the lengthy expressions for the operators or
eigenfunctions. We shall use the notation $|l, m\rangle$ to represent an eigenfunction of
$\hat{L}^2$ with eigenvalue $l(l+1)\hbar^2$, that is simultaneously an eigenfunction of $\hat{L}_z$ with
eigenvalue $m\hbar$, so we have

$$\hat{L}^2\,|l, m\rangle = l(l+1)\hbar^2\,|l, m\rangle, \tag{2.43}$$
$$\hat{L}_z\,|l, m\rangle = m\hbar\,|l, m\rangle. \tag{2.44}$$

[Margin note: The term eigenstate is sometimes used for a quantum state that is
represented by an eigenfunction or eigenvector of a given operator.]


Note that the ket vector is labelled not by the eigenvalues but by the quantum
numbers $l$ and $m$, which are sufficient to specify the eigenvalues $l(l+1)\hbar^2$ and
$m\hbar$, respectively.


Now, the quantum numbers $l$ and $m$ are not completely independent of one
another. This is not surprising: it would certainly be very strange if the
z-component of the angular momentum were greater in magnitude than the
angular momentum itself. We therefore have the condition $|m\hbar| \le \sqrt{l(l+1)}\,\hbar$.
Given that $m$ and $l$ are integers, this implies that $|m| \le l$. It turns out that $m$ can
have all integer values consistent with this condition; in other words, for a given
value of $l$, the possible values of $m$ are

$$m = 0, \pm 1, \pm 2, \ldots, \pm l,$$

that is, $2l + 1$ values in all.
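
The counting of states can be made concrete with a short sketch. The function names `allowed_m` and `ket_labels` are hypothetical, introduced only for this illustration, and the kets are printed in a plain-text form such as `|2, -1>`. The example uses $l = 2$ and checks both the $2l + 1$ count and the condition that $|m|$ never exceeds $\sqrt{l(l+1)}$.

```python
import math

def allowed_m(l):
    """Return the allowed values of m for a non-negative integer l."""
    if l < 0 or int(l) != l:
        raise ValueError("l must be a non-negative integer")
    return list(range(-l, l + 1))

def ket_labels(l):
    """Return plain-text labels |l, m> for every allowed m."""
    return [f"|{l}, {m}>" for m in allowed_m(l)]

l = 2
ms = allowed_m(l)

# Magnitude of the angular momentum in units of hbar.
magnitude = math.sqrt(l * (l + 1))

# |L_z| = |m| hbar never exceeds the magnitude sqrt(l(l+1)) hbar.
assert all(abs(m) <= magnitude for m in ms)

print(ket_labels(l))
print(len(ms) == 2 * l + 1)
```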


Exercise 2.6 
(a) Write out all the possible kets associated with quantum number l = 3. 


(b) Use ket notation to write out two explicit eigenvalue equations, for $\hat{L}_z$ and $\hat{L}^2$,
for the state in which $l = 4$ and $m$ has its minimum possible value.


2.5.3 Describing angular momentum and energy 


We often have to deal with a spherically-symmetric system — that is, a system 
whose Hamiltonian does not depend on the orientation of our coordinate axes. 
An isolated atom is a good example. Where there is spherical symmetry, the
expectation values of $L_x$, $L_y$ and $L_z$ are all conserved. We shall now consider
another aspect of spherical symmetry, which is important for the way we describe 
atoms. 


In a spherically-symmetric system, we have seen that the operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$
all commute with the Hamiltonian operator $\hat{H}$. It immediately follows that $\hat{L}_x^2$, $\hat{L}_y^2$
and $\hat{L}_z^2$ also commute with $\hat{H}$. For example, we have

$$\hat{L}_z^2\hat{H} = \hat{L}_z\hat{L}_z\hat{H} = \hat{L}_z\hat{H}\hat{L}_z = \hat{H}\hat{L}_z\hat{L}_z = \hat{H}\hat{L}_z^2.$$

Consequently, $\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2$ also commutes with $\hat{H}$. We therefore have the
following situation:

• $\hat{L}_z$ commutes with $\hat{L}^2$ (it always does);

• $\hat{L}_z$ commutes with $\hat{H}$ (in a spherically-symmetric system);

• $\hat{L}^2$ commutes with $\hat{H}$ (in a spherically-symmetric system).


So, in a spherically-symmetric system, $\hat{L}_z$, $\hat{L}^2$ and $\hat{H}$ form a set of mutually
commuting operators, in which each operator commutes with the other two. It is
therefore possible to find a set of functions that are simultaneous eigenfunctions 
of all three operators, and these eigenfunctions correspond to states in which 

the system simultaneously has definite values of the z-component of angular 
momentum, the square of the magnitude of the angular momentum, and the 
energy. This allows us to put labels on the energy eigenfunctions of a hydrogen 
atom, as you will see later in the course. This result is very important because the 
spectral lines that are emitted by atoms in the laboratory correspond to transitions 
between quantum states that can be labelled by angular momentum quantum 
numbers. 


2.5.4 A two-dimensional model ‘atom’ 


The above ideas can be illustrated with a simple model. In this chapter, we cannot
discuss real, three-dimensional atoms. Instead, we do what physicists often do to
get insight into a tricky problem. We consider an analogous system in a smaller
number of dimensions. Even though unrealistic, such models often lead to
insights into the workings of real systems.

[Margin note: If you are really pressed for study time, Section 2.5.4 could be
omitted at this stage. We shall return to this topic later in the course.]


In this spirit, we consider a two-dimensional ‘atom’ consisting of a particle of
mass $M$, confined to the $z = 0$ plane and subject to an attractive potential energy
function $V(x, y)$ that depends only on $x$ and $y$. The only angular momentum
component we need to consider is $L_z$, represented by the operator $\hat{L}_z$. There is no
need to consider a separate operator $\hat{L}^2$.

● Write down the Hamiltonian operator and Schrödinger’s time-independent
equation for a particle of mass $M$ subject to such a potential energy function.

○ The Hamiltonian operator is the sum of kinetic and potential energy terms

$$\hat{H} = -\frac{\hbar^2}{2M}\left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right] + V(x, y), \tag{2.45}$$

so the time-independent Schrödinger equation is

$$-\frac{\hbar^2}{2M}\left[\frac{\partial^2\psi(x, y)}{\partial x^2} + \frac{\partial^2\psi(x, y)}{\partial y^2}\right] + V(x, y)\,\psi(x, y) = E\,\psi(x, y). \tag{2.46}$$

We shall suppose that $V(x, y)$ is symmetric under rotations, i.e. it depends only
on the distance $\sqrt{x^2 + y^2}$ from the origin. This prompts us to use the polar
coordinates $r$ and $\phi$ shown in Figure 2.11. The potential energy function is then
written as $V(r)$ and the energy eigenfunction as $\psi(r, \phi)$.

Figure 2.11 Polar coordinates $r$ and $\phi$ can be used instead of $x$ and $y$ to specify the
position of a point in a plane.


It is also necessary to transform the kinetic energy term in the Hamiltonian
operator into polar coordinates. This is a tedious task, so we shall simply state the
result: the Hamiltonian operator becomes

$$\hat{H} = -\frac{\hbar^2}{2M}\left[\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\phi^2}\right] + V(r). \tag{2.47}$$

[Margin note: You are not expected to recall or derive the form of $\hat{H}$ in polar
coordinates.]

Look at the third term in the square brackets, involving the second derivative
with respect to $\phi$. Together with the prefactor $-\hbar^2/2M$, it is just $1/2Mr^2$ times $\hat{L}_z^2$,
because we can write

$$\hat{L}_z^2 = \left(-i\hbar\frac{\partial}{\partial\phi}\right)\left(-i\hbar\frac{\partial}{\partial\phi}\right) = -\hbar^2\frac{\partial^2}{\partial\phi^2}. \tag{2.48}$$

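As a result, Schrödinger’s time-independent equation for a particle of mass $M$,
moving in two dimensions under the influence of a rotationally-symmetric
potential energy function, can be written as

$$-\frac{\hbar^2}{2M}\left[\frac{\partial^2\psi}{\partial r^2} + \frac{1}{r}\frac{\partial\psi}{\partial r}\right] + \frac{1}{2Mr^2}\hat{L}_z^2\,\psi(r, \phi) + V(r)\,\psi(r, \phi) = E\,\psi(r, \phi). \tag{2.49}$$
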
Rather than solve this equation for any specific $V(r)$, we look at properties that
apply for any $V(r)$. We shall look for solutions that are in the product form
$\psi(r, \phi) = R(r)\,f(\phi)$. Substituting this into Equation 2.49 and using the usual
technique of separation of variables, we obtain two ordinary differential equations
linked by a separation constant, $K$:

$$\frac{1}{2M}\hat{L}_z^2 f(\phi) = -\frac{\hbar^2}{2M}\frac{d^2 f}{d\phi^2} = K f(\phi), \tag{2.50}$$
$$-\frac{\hbar^2}{2M}\left[\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr}\right] + V(r)\,R(r) = \left[E - \frac{K}{r^2}\right]R(r). \tag{2.51}$$


The first of these equations is easily solved with the aid of Equation 2.30. 


● Show that $f(\phi) = e^{im\phi}$ is an eigenfunction of Equation 2.50, with eigenvalue
$K = \hbar^2 m^2/2M$.


○ Substituting the given expression for $f(\phi)$ into Equation 2.50, we obtain

$$-\frac{\hbar^2}{2M}\frac{d^2}{d\phi^2}\left(e^{im\phi}\right) = \frac{\hbar^2 m^2}{2M}\,e^{im\phi},$$

from which we conclude that $e^{im\phi}$ is an eigenfunction, with eigenvalue
$K = \hbar^2 m^2/2M$.
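
The substitution can also be checked numerically. The following is a minimal sketch, assuming units in which $\hbar = M = 1$ and using a central finite-difference approximation for the second derivative; the names `f`, `lhs`, `K` and `single_valued` are introduced here for illustration only. It also tests the requirement $f(\phi + 2\pi) = f(\phi)$, which holds for integer $m$ but fails for a non-integer value such as $m = 1/2$.

```python
import cmath
import math

HBAR = 1.0  # assumed units: hbar = 1
M = 1.0     # assumed units: M = 1

def f(phi, m):
    """Trial angular function exp(i m phi)."""
    return cmath.exp(1j * m * phi)

def lhs(phi, m, h=1e-4):
    """-(hbar^2 / 2M) d^2f/dphi^2, estimated by central differences."""
    second = (f(phi + h, m) - 2 * f(phi, m) + f(phi - h, m)) / h**2
    return -(HBAR**2) / (2 * M) * second

def K(m):
    """Separation constant hbar^2 m^2 / 2M."""
    return HBAR**2 * m**2 / (2 * M)

def single_valued(m, phi=0.7):
    """True if f returns to the same value after one full turn."""
    return abs(f(phi + 2 * math.pi, m) - f(phi, m)) < 1e-9

m, phi = 2, 0.3
# The two sides of the eigenvalue equation agree to within the
# accuracy of the finite-difference estimate.
print(abs(lhs(phi, m) - K(m) * f(phi, m)) < 1e-5)
print(single_valued(3), single_valued(0.5))
```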


The function $f(\phi)$ must be single-valued as before, so $m$ is restricted to positive,
negative or zero integer values. Using $K = \hbar^2 m^2/2M$, the differential equation
for $R(r)$ can be written as

$$-\frac{\hbar^2}{2M}\left[\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr}\right] + \frac{\hbar^2 m^2}{2Mr^2}\,R(r) + V(r)\,R(r) = E\,R(r). \tag{2.52}$$

We can imagine solving this equation for each allowed value of $m$. The solutions
have to satisfy various boundary conditions (they must not diverge at infinity, for
example) and this generally means that, for each value of $m$, there is a discrete
set of solutions, which we label by another quantum number, $n$. The allowed
radial solutions are therefore written as $R_{nm}(r)$, and the corresponding energy
eigenvalues as $E_{nm}$.


Assembling the product solutions from their separate parts, we conclude that the
time-independent Schrödinger equation has a discrete set of solutions

$$\psi_{nm}(r, \phi) = R_{nm}(r)\,e^{im\phi}. \tag{2.53}$$

The function in Equation 2.53 is an eigenfunction of the Hamiltonian operator $\hat{H}$,
with eigenvalue $E_{nm}$, and it is simultaneously an eigenfunction of $\hat{L}_z$, with
eigenvalue $m\hbar$. Our simple model therefore provides a concrete example of the
way in which energy eigenfunctions and eigenvalues can be labelled by sets of
quantum numbers that include those for angular momentum, a consequence of the
fact that the potential energy function has rotational symmetry.


2.6 A remaining puzzle 


The quantum theory of angular momentum presented in this chapter has had a 
number of successes: 


1. Quantization We showed why the components of angular momentum come
in $\hbar$-sized lumps.


2. Rotating molecules We explained the overall pattern of the infrared 
absorption spectrum produced by rotating hydrogen chloride molecules. 


3. The labelling of atomic states We showed how atomic states can be labelled,
using the quantum numbers $m$ and $l$ that determine the z-component and the
magnitude of the angular momentum. The correct labelling of atomic states is a
first step towards understanding atomic spectra.


The Stern—Gerlach experiment supports the idea that angular momentum is 
quantized, and so can be regarded as providing further evidence in favour of our 
theory. However, there is a snag. When Stern and Gerlach carried out their 
experiment on silver atoms, they observed just two regions where silver was 
deposited on their detecting screen. However, you have seen that the allowed
values of $l$ are 0, 1, 2, ... (integers) and that, for each value of $l$, there are $2l + 1$
different values of $m$ ($0, \pm 1, \pm 2, \ldots, \pm l$). Each value of $m$ gives a different value of
$L_z$, and hence a different value of $\mu_z$. We would therefore expect to find silver
atoms appearing at $2l + 1$ places; this is always an odd number because $l$ is an
integer. The fact that silver atoms are detected at two places is therefore beyond
the powers of explanation of this chapter. 


The solution to this difficulty will be given in the next chapter. You will see that 
there is a different type of angular momentum, not describable in classical terms, 
whose components have two different values, $+\hbar/2$ and $-\hbar/2$. This new form of
angular momentum is called spin.


Summary of Chapter 2


Section 2.1 In classical physics, the z-component of the angular momentum of a
particle is $L_z = x p_y - y p_x$. The x- and y-components can be obtained from this
by a cyclic permutation of the subscripts: $x \to y \to z \to x$. The magnitude
of the angular momentum of a rigid body rotating about a fixed axis is $L = I\omega$,
where $I$ is the moment of inertia about the axis and $\omega$ is the angular speed of the
body. The corresponding rotational energy is $E_{\text{rot}} = L^2/2I$.


Section 2.2 Many atoms, nuclei and particles behave as magnetic dipoles,
characterized by a magnetic dipole moment $\boldsymbol{\mu}$. In a magnetic field $\mathbf{B}$, a magnetic
dipole has potential energy $E_{\text{mag}} = -\boldsymbol{\mu} \cdot \mathbf{B}$. In a non-uniform magnetic
field pointing in the z-direction, a magnetic dipole experiences a force
$F_z = \mu_z\,\partial B_z/\partial z$. The magnetic dipole moment due to an orbiting charge is
related to the orbital angular momentum by $\boldsymbol{\mu} = \gamma\mathbf{L}$, where $\gamma$ is the gyromagnetic
ratio.


The Stern—Gerlach experiment shows that the magnetic dipole moments of 
atoms are quantized. Because of the intimate link between magnetic dipole 
moments and angular momentum, the experiment also provides evidence for the 
quantization of angular momentum. 


Section 2.3 The quantum-mechanical operator for the z-component of angular
momentum is $\hat{L}_z = -i\hbar\,(x\,\partial/\partial y - y\,\partial/\partial x)$. Similar results for $\hat{L}_x$ and $\hat{L}_y$ are
obtained by cyclic permutation of the subscripts. In spherical coordinates,

$$\hat{L}_z = -i\hbar\,\frac{\partial}{\partial\phi}.$$

The eigenfunctions of $\hat{L}_z$ are of the form $A e^{im\phi}$, and single-valuedness
imposes the requirement that $m$ is an integer (positive, negative or zero). The
corresponding eigenvalues are $m\hbar$.


The square of the magnitude of angular momentum is represented by the operator
$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2$. This has eigenvalues $l(l+1)\hbar^2$, where $l$ is any non-negative
integer. Using these eigenvalues we can explain the infrared absorption spectrum
of HCl, caused by transitions between rotational energy levels.


Section 2.4 The expectation value $\langle L_z \rangle$ remains constant in time if $\hat{L}_z$ commutes
with the Hamiltonian operator of the system. The expectation values of all the
components of angular momentum are conserved if the Hamiltonian operator 

is spherically symmetric. This is an example of the profound link between 
symmetries and conservation laws. 


Section 2.5 Mutually commuting Hermitian operators have simultaneous
eigenfunctions. Angular momentum operators obey the commutation relations

$$[\hat{L}_x, \hat{L}_y] = i\hbar\,\hat{L}_z \quad\text{and}\quad [\hat{L}_z, \hat{L}^2] = 0,$$

with similar results obtained by cyclic permutation. Apart from an exceptional
case, where all three angular momentum components are equal to zero, it is
impossible for any two components of angular momentum to have definite values
in the same state. However, it is possible to find states $|l, m\rangle$ that are simultaneous
eigenfunctions of $\hat{L}_z$ and $\hat{L}^2$ with

$$\hat{L}_z\,|l, m\rangle = m\hbar\,|l, m\rangle \quad\text{and}\quad \hat{L}^2\,|l, m\rangle = l(l+1)\hbar^2\,|l, m\rangle.$$

For a given value of $l$, the values of $m$ are restricted to $m = 0, \pm 1, \pm 2, \ldots, \pm l$.


The energy eigenfunctions of any spherically-symmetric system can be chosen to
be simultaneous eigenfunctions of $\hat{H}$, $\hat{L}_z$ and $\hat{L}^2$.


Section 2.6 The two lines appearing in the Stern—Gerlach experiment for silver 
atoms cannot be explained by the otherwise successful theory. They point to a 
distinct (non-orbital) kind of angular momentum called spin. 


Achievements from Chapter 2 


After studying this chapter, you should be able to: 


2.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


2.2 Give classical expressions for the angular momentum of a particle, and for 
the angular momentum and rotational kinetic energy of a rigid body rotating 
about a fixed axis. 


2.3 Give an account of the behaviour of a magnetic dipole in uniform and 
non-uniform magnetic fields. 


2.4 Describe the relationship between angular momentum and magnetic dipole 
moment, and define the gyromagnetic ratio. 


2.5 Give an account of the Stern—Gerlach experiment, its interpretation and its 
significance. 


2.6 Write down expressions for the angular momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$
in Cartesian coordinates. 


2.7 Recall the expression for $\hat{L}_z$ in spherical coordinates, and obtain its
eigenfunctions and eigenvalues.


2.8 Recall the eigenvalues of $\hat{L}^2$, and give the allowed values of $L_z$ for a given
value of $L^2$.


2.9 State and use the basic commutation relations for angular momentum 
operators. 


2.10 Discuss the conservation of angular momentum in quantum mechanics, and 
interpret this in terms of rotational symmetry. 


2.11 Explain why different components of angular momentum cannot
simultaneously have definite non-zero values; discuss the labelling of energy
eigenfunctions by angular momentum quantum numbers.


Chapter 3 Spin angular momentum


Introduction 


The previous chapter ended with a puzzle. There was a hint that the solution to 
this puzzle is a new kind of angular momentum, one without parallel in the 
everyday world. This chapter introduces this new property: spin. To give a 
quantum-mechanical description of spin, we must go beyond the wave mechanics 
of Book 1 and call upon the more general ideas introduced in Chapter 1, including 
the representation of quantum states as vectors in a vector space. 


This chapter falls into two unequal parts: the first section uses the Stern—Gerlach 
experiment to set out some basic phenomena that the formalism must explain. 
The rest of the chapter is devoted to a step-by-step presentation of the quantum 
theory of spin. This theory will be used throughout the rest of this course, to 
explain the behaviour of identical particles in atoms and solids and to explore 
mysterious phenomena such as quantum entanglement and quantum teleportation, 
which lie at the frontiers of our current understanding of the quantum world. 


The structure of this chapter is as follows. Section 3.1 returns to the 
Stern—Gerlach experiment, presenting a range of characteristic phenomena that a 
quantum theory of spin must explain. Section 3.2 describes how spin states can be 
represented by vectors in a two-dimensional vector space, and Section 3.3 shows 
how observable quantities related to spin can be represented by matrices. These 
descriptions are very different to those of wave mechanics, where quantum states 
were represented by wave functions and observables by differential operators. 
Nevertheless, both wave mechanics and spin theory have a common structure, and 
many similarities become evident in the Dirac notation of Chapter 1. 


As always in quantum mechanics, we are interested in calculating the probabilities 
of the outcomes of experiments. Section 3.4 shows how this is done for spin 
measurements, and goes on to calculate expectation values. Like all states, the 
state describing the spin of a particle can change in time. Section 3.5 shows how
Schrödinger’s equation can be applied to spin states, and uses it to predict the
time-development of a spin state in the presence of an external magnetic field. 


3.1 Spin: what the formalism must explain 


In this section we examine several variations of the Stern—Gerlach experiment 
which reveal the basic phenomena that a quantum theory of spin must reproduce. 


3.1.1 Observations with Stern—Gerlach apparatus 


Figure 3.1 shows a schematic diagram of the simplest type of Stern—Gerlach 
experiment. In this arrangement, a beam of silver atoms from an oven is 

sent through an inhomogeneous magnetic field produced by a magnet with 
specially-shaped pole pieces — a Stern—Gerlach magnet, as described in 

Chapter 2. Throughout this discussion, we shall take the pointed pole piece to be a
north pole (N), and the notched pole piece to be a south pole (S). The red arrow in
Figure 3.1 indicates the orientation vector of the magnet. This is chosen to point
along the line of symmetry from the south pole piece to the north pole piece, in
the direction of increasing magnetic field strength; it will help us compare the
alignments of different Stern–Gerlach magnets.


We use a fixed coordinate system, x, y, z; in Figure 3.1 the beam is incident along 
the y-direction and the magnet is oriented in the z-direction. These choices will 
simplify the analysis in subsequent sections. In this idealization, all details of the 
oven, the collimation of the beam, the production of a high vacuum for the beam, 
and the detection process have been omitted. 




Figure 3.1 shows that an incident beam of silver atoms is split into two emerging 
beams (or components), which are detected when they strike a measurement 
screen. One component consists of atoms that have been deflected in the direction 
of the orientation vector of the magnet, which in this case is the positive 
z-direction. The other component consists of atoms that have been deflected in 
the opposite direction. The key result is that only two components are detected, 
corresponding to two allowed values of $\mu_z$, the z-component of the magnetic
dipole moment of the atoms. Classically, one would expect a continuous
distribution of deflections, corresponding to a continuous range of allowed values
of $\mu_z$.


In the previous chapter, the quantization of $\mu_z$ was attributed to the
quantization of the z-component of orbital angular momentum. However, the
quantum-mechanical theory of orbital angular momentum predicts that there is
always an odd number of emerging beams (1, 3, 5, etc.), and this cannot be
reconciled with the observed pair of emerging beams. We shall continue to
assume that the magnetic dipole moments of silver atoms are due to a type of
angular momentum, but this cannot be orbital angular momentum; it is an entirely
new type of angular momentum called spin angular momentum or spin for
short. Our interpretation is that a silver atom has two possible values of $S_z$,
the z-component of its spin. We cannot obtain these two numerical values
directly from the Stern–Gerlach experiment, but for the
moment, we will simply assert that the two possible values are $S_z = +\hbar/2$ and
$S_z = -\hbar/2$. These two values correspond to two different values of $\mu_z$ and hence
produce the two emerging beams that we see in Figure 3.1.


[Margin note: With this choice of orientation vector, it will turn out that atoms
with spin ‘up’ in the direction of the orientation vector are deflected upwards,
while atoms with spin ‘down’ are deflected downwards.]


Figure 3.1 A schematic diagram of a Stern–Gerlach experiment: (a) perspective view
and (b) cross-sectional view. The magnet is oriented in the z-direction and the
beam travels in the y-direction.


Of course, the atom does not know how the z-axis has been chosen, and the same 
two-valuedness of a component of spin is observed in the set-up of Figure 3.2. 




Figure 3.2 The Stern-Gerlach magnet of Figure 3.1 is rotated so that it is 
oriented in the x-direction. 


Exercise 3.1 When the magnet of Figure 3.1 is rotated through 90° to give the 
orientation of Figure 3.2, two components are detected, deflected in the positive 
and negative x-directions. The magnitudes of the deflections are unchanged. 
Interpret this result in terms of the possible outcomes of measurements of $S_x$, the
component of spin in the x-direction.


The next type of Stern—Gerlach experiment we consider involves two magnets in 
series. Figure 3.3 shows what is observed when we take one component of the 
beam and pass it through a second magnet with the same orientation as the first. 



Figure 3.3 Two identically oriented Stern–Gerlach magnets in series. A screen
placed after the first magnet allows only the upper component of the beam to
reach the second magnet.


The result is not very surprising. Taking the component that is deflected in the 
positive z-direction by the first magnet, we find only one component emerging 
from the second magnet, and it has been further deflected in the positive 
z-direction. But what happens if the orientation of the second magnet differs from 
that of the first? Figure 3.4 illustrates the result of such an experiment. 


Figure 3.4 Taking one component from a Stern–Gerlach magnet oriented
in the z-direction, we obtain two components, of equal intensity,
from a magnet oriented in the x-direction. (From now on, the
magnets will be drawn in simplified box form.)


When the second magnet is rotated through 90° about the incident beam, two
components are detected, with deflections in the ±x-directions, corresponding to
atoms with $S_x = \pm\hbar/2$. Moreover, equal numbers of atoms from the beam
prepared by the first magnet are found to have $S_x = +\hbar/2$ and $S_x = -\hbar/2$, when
analyzed by the second magnet.


A Stern—Gerlach apparatus that stops and detects both emerging beams will be 
called a spin analyzer. By contrast, a Stern—Gerlach apparatus that allows one of 
the emerging beams to pass undetected, while blocking off the other, will be 
called a spin preparer, because it prepares particles with a definite component of 
spin along the orientation direction of the apparatus. We adopt the convention of 
retaining the beam with a positive spin component in the orientation direction. 
Thus Figure 3.3 depicts a spin preparer oriented in the z-direction, and this
prepares a beam of atoms with $S_z = +\hbar/2$. A spin analyzer oriented in the
z-direction reveals this fact by detecting only the value $S_z = +\hbar/2$. The result
shown in Figure 3.4, on the other hand, is that a spin analyzer oriented in the
x-direction detects equal numbers of atoms with $S_x = +\hbar/2$ and $S_x = -\hbar/2$.


The general rule for what is found when the analyzer’s orientation makes an angle
$\theta$ with the preparer’s orientation can be stated as follows.


The $\cos^2(\theta/2)$ rule


Suppose that a spin preparer and a spin analyzer are collinear, with an
angle $\theta$ between their orientation vectors (Figure 3.5). Then an atom
emerging with a positive spin component along the orientation of the spin
preparer has a probability $\cos^2(\theta/2)$ of being detected with a positive spin
component along the orientation of the spin analyzer. (By our conventions,
such an atom is deflected in the direction of the orientation vector of the
analyzer.)


Figure 3.5 An experiment in which the spin analyzer A is rotated at an angle $\theta$
with respect to the orientation of the spin preparer P, illustrating the $\cos^2(\theta/2)$
rule. A fraction $\cos^2(\theta/2)$ of the prepared atoms is detected in the component
deflected along the analyzer’s orientation vector.
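
The $\cos^2(\theta/2)$ rule is easy to evaluate for particular angles. The short sketch below is an illustration only; the function name `prob_up` and the choice to express the angle in degrees are assumptions made for this example. It tabulates the probability of detecting a positive spin component for a few analyzer angles.

```python
import math

def prob_up(theta_deg):
    """Probability of a positive spin component along the analyzer,
    for an angle theta (in degrees) between preparer and analyzer."""
    half_angle = math.radians(theta_deg) / 2
    return math.cos(half_angle) ** 2

# Tabulate the rule for a few angles between preparer and analyzer.
for theta in (0, 60, 90, 120, 180):
    print(theta, round(prob_up(theta), 3))
```

Parallel orientations (0°) give certainty, perpendicular orientations (90°) give equal chances, and intermediate angles interpolate smoothly between these cases.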


The appearance of $\theta/2$ rather than $\theta$ in this rule may be surprising, but spin is an
entirely quantum-mechanical concept whose properties do not necessarily agree
with classical intuition. It is also important to note that the $\cos^2(\theta/2)$ rule gives
the probability of a particular outcome; the fate of any given atom cannot be
predicted with certainty unless $\theta$ is 0°, or is some multiple of 180°. As with all
quantum-mechanical phenomena governed by probabilities, the probability
$\cos^2(\theta/2)$ estimates the fractional frequency that will be observed in the limit of
a very large number of measurements. If only one atom passes through the
apparatus, and $\theta = 90^\circ$, we will not find that half an atom has gone each way!
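
The statistical meaning of the rule can be illustrated with a simple simulation. This is a sketch under stated assumptions: each atom is modelled as an independent random trial with success probability $\cos^2(\theta/2)$, and the function name `simulate_fraction_up`, the seed and the sample sizes are all choices made for this example rather than anything taken from a real experiment.

```python
import math
import random

def simulate_fraction_up(theta_deg, n_atoms, seed=0):
    """Simulate n_atoms independent detections and return the
    fraction found with a positive spin component."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    p_up = math.cos(math.radians(theta_deg) / 2) ** 2
    ups = sum(1 for _ in range(n_atoms) if rng.random() < p_up)
    return ups / n_atoms

# A single atom goes entirely one way or the other ...
print(simulate_fraction_up(90, 1))
# ... but the observed fraction settles towards cos^2(45 deg) = 0.5
# as the number of atoms grows.
for n in (10, 1000, 100000):
    print(n, simulate_fraction_up(90, n))
```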


● What is the probability that an atom’s spin component along the direction of
the analyzer’s orientation vector will be detected to have the value $-\hbar/2$?


○ There are two possible values of the atom’s spin component along the
direction of the analyzer’s orientation vector: $+\hbar/2$ and $-\hbar/2$. The
probability of getting a value $+\hbar/2$ is $\cos^2(\theta/2)$, so the probability of getting
a value $-\hbar/2$ is $1 - \cos^2(\theta/2) = \sin^2(\theta/2)$.


Exercise 3.2 


(a) Verify that the $\cos^2(\theta/2)$ rule correctly describes the results of the
experiments in Figures 3.3 and 3.4.


(b) Use the rule to predict what will happen when the second magnet of
Figure 3.3 is rotated by 180° about the beam.


(c) Use the rule to predict what will happen when the second magnet of
Figure 3.4 is rotated by 180° about the beam.


Now consider the three-magnet situation shown in Figure 3.6, where the second 
magnet now acts as a spin preparer, P’, and the third acts as a spin analyzer, A, 
oriented in the same direction as the first spin preparer, P. 




Figure 3.6 The experimental arrangement for Worked Example 3.1. 


Worked Example 3.1

[Margin note: Essential skill: Applying the $\cos^2(\theta/2)$ rule]

What fraction of the beam prepared by P in Figure 3.6 is detected in the
lower component by A?


Solution


The angle between the orientations of P and P’ is $\theta$, so the fraction of atoms
prepared by P that emerges in the positive direction of the orientation vector
of P’ is $\cos^2(\theta/2)$. Because A has the same orientation as P, the angle between
P’ and A is also $\theta$. Of the atoms emerging from P’, a fraction
$1 - \cos^2(\theta/2) = \sin^2(\theta/2)$ is therefore deflected in the opposite direction to
the orientation vector of A and is detected in the lower component by A.
So a fraction

$$\cos^2(\theta/2)\,\sin^2(\theta/2) = \tfrac{1}{4}\sin^2\theta$$

of the beam prepared by P is eventually detected in the lower component
by A.

[Margin note: Standard trigonometrical identities are listed inside the back cover
of the book.]
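
The result of the worked example can be checked numerically. In the sketch below, the function name `fraction_lower` is hypothetical and the angle is taken in degrees; it multiplies the two probabilities from the solution and compares the product with the closed form $\tfrac{1}{4}\sin^2\theta$.

```python
import math

def fraction_lower(theta_deg):
    """Fraction of the beam prepared by P that reaches the lower
    component of A, when P' is at angle theta (degrees) to P and A."""
    half_angle = math.radians(theta_deg) / 2
    passes_p_prime = math.cos(half_angle) ** 2  # transmitted by P'
    down_at_a = math.sin(half_angle) ** 2       # then deflected down by A
    return passes_p_prime * down_at_a

for theta in (0, 30, 90, 150):
    closed_form = 0.25 * math.sin(math.radians(theta)) ** 2
    # The product of probabilities matches (1/4) sin^2(theta).
    print(theta, math.isclose(fraction_lower(theta), closed_form, abs_tol=1e-12))
```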


Worked Example 3.1 reveals a remarkable fact: if P’ is oriented at right angles to P 
($\theta = 90^\circ$), about a quarter of the atoms prepared by P will be detected in the lower
component of A. But if P’ were removed, none of the atoms prepared by P would 
be detected in the lower component of A. The spin preparer P’ does not simply act 
as a sort of filter, selecting atoms that are somehow already predetermined to be 
deflected along its orientation vector. Instead, its presence radically changes the 
state of atoms. Some go into the blocked beam and do not emerge; the remainder 
are prepared in a state with a spin component parallel to the orientation of P’; this 
is not the same as the state of the atoms entering P’. Having passed through P’, 
these atoms have some chance of being detected in the lower component by A; 
without passing through P’, they would have no such chance. 


Exercise 3.3 Consider a series of spin preparers, $P_1$ to $P_n$, followed by a spin
analyzer, A, which is oriented at right angles to the first spin preparer $P_1$. The
angles between successive spin preparers, and the angle between the last spin
preparer and A, are all equal to one another. Calculate the fraction of the beam
prepared by $P_1$ that is deflected along the orientation vector of A for $n = 1$, $n = 2$
and $n = 3$. Evaluate your answers to two significant figures.


These applications of the $\cos^2(\theta/2)$ rule show that the presence of one or more
spin preparers between an initial spin preparer and a final analyzer can radically 
affect the probabilities of the outcomes measured by the analyzer. 
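
The effect of intermediate preparers can be explored with a small calculation. This is a sketch under an assumed geometry, not a description of a real apparatus: a total angle between the first preparer and the final analyzer is divided into equal steps, and the $\cos^2(\theta/2)$ rule is applied once per step. The function name `fraction_transmitted` and its parameters are invented for this illustration, and the example uses a total angle of 180°.

```python
import math

def fraction_transmitted(total_angle_deg, n_steps):
    """Fraction of the initially prepared beam that is deflected along
    the analyzer's orientation vector, when the total angle is divided
    into n_steps equal rotations (n_steps - 1 intermediate preparers)."""
    step = math.radians(total_angle_deg) / n_steps
    # The cos^2(theta/2) rule is applied once for each equal rotation.
    return (math.cos(step / 2) ** 2) ** n_steps

# With no intermediate preparer, nothing reaches the 'up' channel of an
# analyzer turned through 180 degrees; adding preparers changes this.
for n in (1, 2, 3, 10):
    print(n, round(fraction_transmitted(180, n), 4))
```

With two steps the result reproduces the quarter found in Worked Example 3.1 for $\theta = 90^\circ$, and the fraction grows as more intermediate preparers are added.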


What have we studied the properties of? 


The original Stern—Gerlach experiment was carried out with silver atoms. Why 
silver? Wouldn’t it be more interesting to study the spin of fundamental particles 
such as electrons? Maybe — but there are formidable practical difficulties in 
doing so. The reason is that electrons are charged particles, and moving charges 
experience magnetic forces in a magnetic field. Electrons would be strongly 
deflected by the magnetic fields in a Stern—Gerlach apparatus, simply because 
they are moving charged particles. Such a deflection would overwhelm the much 
more delicate deflection due to the magnetic dipole moment of an electron in an 
inhomogeneous magnetic field, and prevent us from observing effects directly 
associated with spin. 


Neutral silver atoms, however, have no charge. What is more, the electrons in a 
silver atom are arranged in such a way that the magnetic dipole moments of all 
but one of the electrons cancel out. The remaining electron has no orbital angular 
momentum (l = 0) so there is no orbital contribution to its magnetic dipole 
moment. This means that the magnetic dipole moment of a silver atom can be 
regarded as being due to the spin of a single electron. By carrying out the 
Stern—Gerlach experiment with silver atoms we are, in effect, studying the spin of 
a single free electron, without the unwanted side effects associated with its charge. 
Other atoms, such as sodium or potassium, can also be used for this purpose. 


Finally, we remark that the experiments just described are really ‘thought 
experiments’: for technical reasons it is quite hard to do experiments of exactly 
the kind described here, particularly with three Stern—Gerlach magnets in series. 
However, the basic phenomena described here have been abundantly verified with 
experiments of one kind or another. 


3.1.2 The quantum mysteries of spin 


The series of thought experiments outlined above bring out the key points that a 
quantum theory of spin must explain. Developing such a theory is the challenge 
taken up by the rest of this chapter, but first it is worth looking back at the 
experiments to note their characteristic quantum features. 


We first correct an impression that might have been left by the way we have drawn 
the figures. From Figure 3.1 onwards, we have shown diverging paths along 
which the atoms travel. This must not be taken too literally! Quantum mechanics 
has done away with the notion of a ‘trajectory’. An atom is not an ‘up’ atom or a 
‘down’ atom until it has actually been measured to be so. Just as a particle that 
has passed through a slit does not ‘decide’ where to materialize in a diffraction 
pattern until it reaches a detecting screen, so an atom does not ‘decide’ whether it 
has been deflected up or down until it has been detected. Once the atom has been 
detected at a particular point, it is as if it had followed a path to that point. 


The first key point is that the Stern–Gerlach experiment, carried out with silver
atoms, produces just two regions on the detecting screen where atoms appear.
This means that the magnetic dipole moment of a silver atom (which is also the
magnetic dipole moment due to the spin of a single electron) cannot be due
to orbital angular momentum; if it were, there would be an odd number of
components. Instead, we assume that the magnetic dipole moment is due to a new,
non-classical type of angular momentum, called spin. Even if an electron were at 
rest at a single point in space, it would possess spin, together with an associated 
magnetic dipole moment. Spin is therefore regarded as an intrinsic property of an 
electron and, for this reason, it is sometimes called intrinsic angular momentum. 


The other key point is that a spin preparer creates a beam of atoms, all in a state
with a definite value of a given spin component, say with $S_z = +\hbar/2$. We say
that $S_z$ has a definite value because every measurement of $S_z$ gives the value
$+\hbar/2$. If we measure the spin component in some other direction, we always
get either $+\hbar/2$ or $-\hbar/2$, but we cannot say which of these two outcomes
will occur for any single atom. Instead, when large numbers of atoms pass
through the apparatus, we can use the $\cos^2(\theta/2)$ rule to predict the proportion of
measurements that give $+\hbar/2$ and the proportion that give $-\hbar/2$.
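

As a rough numerical illustration of the $\cos^2(\theta/2)$ rule, the sketch below tabulates the predicted proportions for a few analyzer settings. It assumes that $\theta$ is the angle between the preparer and analyzer directions, that $\cos^2(\theta/2)$ is the proportion of $+\hbar/2$ results, and that the remaining proportion, $\sin^2(\theta/2)$, gives $-\hbar/2$. The function names are illustrative only.

```python
import math

def up_fraction(theta_deg):
    """Predicted proportion of +hbar/2 results: cos^2(theta/2)."""
    return math.cos(math.radians(theta_deg) / 2) ** 2

def down_fraction(theta_deg):
    """Predicted proportion of -hbar/2 results: sin^2(theta/2)."""
    return math.sin(math.radians(theta_deg) / 2) ** 2

# Tabulate the two proportions for a few angles between preparer and analyzer.
for theta_deg in (0, 60, 90, 120, 180):
    print(f"theta = {theta_deg:3d} deg: "
          f"up {up_fraction(theta_deg):.3f}, down {down_fraction(theta_deg):.3f}")
```

The two proportions always sum to one, and at 90° each outcome is equally likely.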


Spin was proposed by Goudsmit and Uhlenbeck in 1925. Their evidence was 
based on a detailed analysis of the spectral lines of hydrogen atoms in a magnetic 
field, but they were also inspired by Pauli’s famous exclusion principle. Pauli had 
found that the regularities of the Periodic Table could be explained if a maximum 
of two electrons were allowed to occupy each atomic orbital. He referred to this 
as a ‘classically non-describable two-valuedness’, but was reluctant to interpret it 
in terms of spin. 


A measure of how surprising spin seemed at the time is provided by the story that 
the great Lorentz tried to dissuade Goudsmit and Uhlenbeck from publishing on 
the grounds that he had calculated that a spinning electron, with the correct 
magnetic dipole moment, would have a surface speed greater than that of light. 
Uhlenbeck was about to withdraw the paper, but it had been sent to the journal 
already! Lorentz’s model of an electron proved to be incorrect; the quantum 
concept of spin cannot be modelled by anything so classical as a spinning ball. 
Perhaps by now we should be used to the fact that quantum mechanics springs 
surprises on us. Spin is a form of angular momentum, but is not one that can be 
pictured in classical terms. 


Finally, we note that the theory we shall develop applies to any type of particle
whose spin components have two possible values, $+\hbar/2$ and $-\hbar/2$. For reasons
that will emerge later, particles with this property are called spin-$\frac{1}{2}$ particles. The
family of spin-$\frac{1}{2}$ particles is large and important, and includes electrons, muons,
protons, neutrons and quarks. However, some particles (such as photons) have a
different type of spin, and they are not described by the formalism of this chapter.


3.2 Representing spin states 


We shall assume that it makes sense to talk about the spin state of a particle, 
independently of its other properties. When we do this, it is obvious that spin 
states cannot be represented by ordinary wave functions, which are functions of 
position. Amongst other things, a wave function tells us the probability density 
for finding a particle at different points in space. But this has nothing to do with
the spin state of an electron. So far as we know, electrons are point-like entities;
their spin at each instant has a direction, but no spatial extent. 


Fortunately, Chapter 1 gave us an alternative way of representing quantum states, 
using vectors in an abstract vector space. This notation gave us a compact way of 
writing the equations of wave mechanics, and it also suggests a different way of 
thinking about quantum mechanics. You may recall that a wave function $\Psi(x, t)$
corresponds to a state vector $|\Psi\rangle$ in function space, an infinite-dimensional
vector space with an inner product defined by


$$\langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)\, g(x)\, \mathrm{d}x. \tag{3.1}$$


In a system such as a harmonic oscillator, the energy eigenfunctions $\psi_i(x)$
provide an orthonormal basis for function space. The energy eigenfunctions are
orthonormal because


$$\langle \psi_i|\psi_j\rangle = \int_{-\infty}^{\infty} \psi_i^*(x)\, \psi_j(x)\, \mathrm{d}x = \delta_{ij}, \tag{3.2}$$


and they provide a basis for function space because any state vector $|\Psi\rangle$ can be
expressed as a linear combination of the eigenvectors $|\psi_i\rangle$:


$$|\Psi\rangle = \sum_{i=0}^{\infty} a_i\, |\psi_i\rangle.$$


This sum generally contains an infinite number of mutually orthogonal
eigenvectors $|\psi_i\rangle$, which is why function space is said to be infinite-dimensional.
The coefficients $a_i$ are interpreted as probability amplitudes: $|a_i|^2$ is the
probability that an energy measurement in the state $|\Psi\rangle$ will give the $i$th energy
eigenvalue, $E_i$. Because one or other of the energy eigenvalues must be obtained,
we have


$$\sum_{i=0}^{\infty} |a_i|^2 = 1.$$


You will see that spin states can be described in a similar way. We shall suppose
that the spin state of a spin-$\frac{1}{2}$ particle can be represented by a vector in an abstract
vector space, which we shall call spin space.


3.2.1 Representing spin states by vectors 


You have seen that a spin analyzer oriented in the z-direction measures the
z-component of spin, with two possible values: $S_z = +\hbar/2$ and $S_z = -\hbar/2$.
There is a special spin state that is certain to give the value $S_z = +\hbar/2$. We call
this the spin-up state relative to the z-axis, and represent it by the vector $|{\uparrow_z}\rangle$.
Another special spin state is certain to give the value $S_z = -\hbar/2$. We call this the
spin-down state relative to the z-axis, and represent it by the vector $|{\downarrow_z}\rangle$.


The possibilities of an atom being ‘spin-up’ or ‘spin-down’ are mutually
exclusive, so it is natural to assume that $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ are orthogonal to one
another in spin space. We shall also assume that these vectors are normalized,
although the interpretation of ‘orthogonal’ and ‘normalized’ will be left rather
vague for the moment.


There are many possible spin states, which can be produced by spin preparers
oriented in arbitrary directions. However, we shall assume that:


Any spin state $|A\rangle$ can be written as a linear combination of $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$,
so


$$|A\rangle = a_1|{\uparrow_z}\rangle + a_2|{\downarrow_z}\rangle, \tag{3.3}$$


where $a_1$ and $a_2$ are complex numbers.


In other words, $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ provide an orthonormal basis for spin space.
Because any spin state $|A\rangle$ can be written as a linear combination of just two basis
vectors, we say that the spin space of a spin-$\frac{1}{2}$ particle is two-dimensional.


We shall assume that the coefficients $a_1$ and $a_2$ have the usual
quantum-mechanical interpretation of being probability amplitudes. When a
silver atom in the state $|A\rangle$ of Equation 3.3 enters a spin analyzer and has the
z-component of its spin measured, the probability of getting $+\hbar/2$ is $|a_1|^2$, and
the probability of getting $-\hbar/2$ is $|a_2|^2$. Since these probabilities must sum to
one, we require that


$$|a_1|^2 + |a_2|^2 = 1. \tag{3.4}$$
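

The probability interpretation of the coefficients in Equation 3.3 can be sketched in a few lines of Python. The amplitudes used below are hypothetical values chosen only so that Equation 3.4 is satisfied; they are not taken from the text, and the function name is illustrative.

```python
import math

def sz_probabilities(a1, a2):
    """Return (P(+hbar/2), P(-hbar/2)) for the state a1|up_z> + a2|down_z>."""
    p_up, p_down = abs(a1) ** 2, abs(a2) ** 2
    # Equation 3.4: the two probabilities must sum to one.
    if not math.isclose(p_up + p_down, 1.0, rel_tol=1e-9):
        raise ValueError("state is not normalized: |a1|^2 + |a2|^2 != 1")
    return p_up, p_down

# Hypothetical amplitudes for illustration (a2 is purely imaginary).
a1, a2 = 0.6, 0.8j
p_up, p_down = sz_probabilities(a1, a2)
print(p_up, p_down)
```

Only the moduli of the amplitudes enter these probabilities, so the factor of $i$ in the second amplitude has no effect on a measurement of $S_z$.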


Exercise 3.4 A spin preparer produces atoms in the spin state

$|A\rangle =$


(a) If a single atom is prepared in the state $|A\rangle$, what prediction can be made
about the result of measuring $S_z$ for this atom?


(b) If a million atoms are prepared in the state $|A\rangle$, what prediction can be made
about the results of measuring $S_z$ for this collection of atoms?


We said earlier that the basis vectors $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ are ‘orthogonal to one
another’ and ‘normalized’. Terms such as these are part of a wider issue: how do
we define an inner product between vectors in spin space? We cannot use
Equation 3.1 because spin states are not described by wave functions. However,
we can proceed as follows.


First, we shall use the familiar bra-ket notation of Dirac. Given two vectors $|A\rangle$
and $|B\rangle$ in spin space, we denote their inner product by $\langle A|B\rangle$. Then we express
the fact that the basis vectors are normalized and orthogonal to one another by
writing


$$\langle{\uparrow_z}|{\uparrow_z}\rangle = \langle{\downarrow_z}|{\downarrow_z}\rangle = 1, \tag{3.5}$$

$$\langle{\uparrow_z}|{\downarrow_z}\rangle = \langle{\downarrow_z}|{\uparrow_z}\rangle = 0. \tag{3.6}$$


We shall assume that the inner product in spin space obeys similar rules to the 
inner product in function space. This is a reasonable assumption, based on the 
belief that there is an underlying unity to quantum mechanics.
In particular, we shall assume that the inner product of $|A\rangle = a_1|{\uparrow_z}\rangle + a_2|{\downarrow_z}\rangle$
with $|B\rangle = b_1|{\uparrow_z}\rangle + b_2|{\downarrow_z}\rangle$ can be evaluated as follows. We write


$$\langle A|B\rangle = \big(a_1^*\langle{\uparrow_z}| + a_2^*\langle{\downarrow_z}|\big)\big(b_1|{\uparrow_z}\rangle + b_2|{\downarrow_z}\rangle\big),$$


where the complex conjugates in the first set of round brackets are a typical
feature of taking inner products in complex vector spaces. Then, multiplying out
the brackets gives


$$\langle A|B\rangle = a_1^* b_1\langle{\uparrow_z}|{\uparrow_z}\rangle + a_1^* b_2\langle{\uparrow_z}|{\downarrow_z}\rangle + a_2^* b_1\langle{\downarrow_z}|{\uparrow_z}\rangle + a_2^* b_2\langle{\downarrow_z}|{\downarrow_z}\rangle.$$


Finally, there is a great simplification because the basis vectors $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ are
normalized and orthogonal, obeying Equations 3.5 and 3.6. We conclude that


$$\langle A|B\rangle = a_1^* b_1 + a_2^* b_2. \tag{3.7}$$


This equation tells us how to evaluate the inner product of vectors in spin space. 
In fact, its derivation follows the same lines as Worked Example 1.1 in Chapter 1. 
The only difference is that vectors in spin space have only two components, while 
vectors in function space have an infinite number of them; in many ways, the 
quantum mechanics of spin is much simpler than wave mechanics. 
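

A minimal sketch of Equation 3.7 in Python is given below. Spinors are stored as pairs of components $(a_1, a_2)$; the two example vectors are hypothetical and were chosen only to show the role of the complex conjugates. The function name `inner` is an illustrative choice.

```python
import math

def inner(A, B):
    """Inner product <A|B> = a1* b1 + a2* b2 (Equation 3.7)."""
    (a1, a2), (b1, b2) = A, B
    return complex(a1).conjugate() * b1 + complex(a2).conjugate() * b2

r = 1 / math.sqrt(2)
A = (r, 1j * r)   # hypothetical example: components (a1, a2)
B = (r, r)        # hypothetical example: components (b1, b2)

print(inner(A, B))   # uses the conjugates of A's components
print(inner(B, A))   # uses the conjugates of B's components
print(inner(A, A))   # squared norm of A
```

Swapping the order of the two vectors changes which components are conjugated, so the two results are in general different complex numbers.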


In the special case where $|B\rangle = |A\rangle$, Equations 3.7 and 3.4 combine to give

$$\langle A|A\rangle = a_1^* a_1 + a_2^* a_2 = |a_1|^2 + |a_2|^2 = 1.$$


Any spin state vector must be normalized in this way. If a spin state vector is not 
correctly normalized, we must multiply it by a suitable normalization constant. 


Exercise 3.5 Use Equation 3.7 to verify that $\langle A|B\rangle^* = \langle B|A\rangle$ for vectors in
spin space. (The result of Exercise 3.5 is important and will be used later.)


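Exercise 3.6 Find normalization constants so that the following three spin
vectors are normalized, and write out the corresponding normalized forms:


(a) $|{\uparrow_z}\rangle + |{\downarrow_z}\rangle$, (b) $|{\uparrow_z}\rangle + i|{\downarrow_z}\rangle$, (c) $5|{\uparrow_z}\rangle - 12|{\downarrow_z}\rangle$.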


We generally choose the normalization constant to be real and positive, but this is
done only for convenience. You may recall that a wave function $\Psi(x,t)$ can be
multiplied by an overall phase factor $e^{i\alpha}$, where $\alpha$ is real, without making any
difference to the state being described. The same is true for spin states. This
means that two spin vectors which look quite different may actually describe the
same state. For example, the vectors


$$|A\rangle = \frac{1}{\sqrt{2}}\big(|{\uparrow_z}\rangle + i|{\downarrow_z}\rangle\big) \quad\text{and}\quad |B\rangle = \frac{1}{\sqrt{2}}\big(-i|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\big) \tag{3.8}$$


describe the same spin state because $|B\rangle = -i|A\rangle$. Note carefully that while
overall phase factors (multiplying all terms in a spin vector) make no difference,
the phases of individual terms generally do matter. For example, the vectors


$$|A\rangle = \frac{1}{\sqrt{2}}\big(|{\uparrow_z}\rangle + i|{\downarrow_z}\rangle\big) \quad\text{and}\quad |C\rangle = \frac{1}{\sqrt{2}}\big(i|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\big) \tag{3.9}$$


correspond to different states because $|C\rangle$ is not a multiple of $|A\rangle$.
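

The difference between an overall phase and a relative phase can be checked numerically. The sketch below uses a criterion that is not stated in the text but is standard: two normalized spinors differ only by an overall phase factor exactly when the modulus of their inner product equals 1. The components are those of the vectors in Equations 3.8 and 3.9; the helper names are illustrative.

```python
import math

def inner(A, B):
    """Inner product <A|B> = a1* b1 + a2* b2 (Equation 3.7)."""
    return (complex(A[0]).conjugate() * B[0]
            + complex(A[1]).conjugate() * B[1])

def same_state(A, B, tol=1e-12):
    """True if normalized spinors A and B differ only by an overall phase."""
    return abs(abs(inner(A, B)) - 1.0) < tol

r = 1 / math.sqrt(2)
A = (r, 1j * r)     # |A> of Equations 3.8 and 3.9
B = (-1j * r, r)    # |B> of Equation 3.8
C = (1j * r, r)     # |C> of Equation 3.9

print(same_state(A, B))   # |B> = -i|A>, an overall phase factor
print(same_state(A, C))   # only the relative phase of the two terms differs
print(abs(inner(A, C)))   # modulus of the overlap between |A> and |C>
```

For this particular pair the overlap between $|A\rangle$ and $|C\rangle$ vanishes, so the two vectors are not merely different states but orthogonal ones.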


Exercise 3.7 Show that $|C\rangle$ in Equation 3.9 cannot be expressed as a multiple
of $|A\rangle$.


3.2.2 Representing spin states by matrices 


We now introduce an alternative representation of spin states which simplifies
many calculations. (Section 8.3 of the Mathematical toolkit reviews matrices.)
We represent $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ by the following column matrices:


$$|{\uparrow_z}\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad |{\downarrow_z}\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{3.10}$$


These matrices have two elements because spin space is two-dimensional. Note
that we have chosen very simple matrices to represent states that are spin-up or
spin-down relative to the z-direction. There is nothing special about the z-axis,
and it can be chosen to point in any direction in space, but spin states with definite
values of $S_z$ are always represented by the matrices of Equation 3.10. This
convention is universally accepted and should not be broken.


Any vector $|A\rangle$ in spin space can be expressed as a linear combination of $|{\uparrow_z}\rangle$ and
$|{\downarrow_z}\rangle$. This means that we can express $|A\rangle$ in Equation 3.3 as


$$|A\rangle = a_1\begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_2\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}. \tag{3.11}$$


So any spin state of a spin-$\frac{1}{2}$ particle can be represented as a two-element matrix,
which is called a spinor.


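The mathematics of matrices fits exactly the manipulations we need to carry out
on vectors in spin space. For example, given two spinors


$$|A\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} \quad\text{and}\quad |B\rangle = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$


and a constant $k$, we can use the rules of matrix algebra to write


$$|A\rangle + |B\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ a_2 + b_2 \end{bmatrix}$$


and


$$k|A\rangle = k\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} k a_1 \\ k a_2 \end{bmatrix}.$$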

These equations make good sense because the vectors on the left-hand sides are 
correctly represented by matrices on the right-hand sides. 


We can also write the inner product of two vectors in matrix form. To do this, we
first recall that


$$\langle A|B\rangle = a_1^* b_1 + a_2^* b_2, \tag{Eqn 3.7}$$


and then note that


$$a_1^* b_1 + a_2^* b_2 = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix},$$


where the product on the right-hand side involves matrix multiplication (going
along the row of the first matrix, and down the column of the second matrix,
multiplying corresponding elements and adding the results). We therefore see that
the inner product of two spin vectors can be written in the matrix form


$$\langle A|B\rangle = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}. \tag{3.12}$$


The inner product $\langle A|B\rangle$ can be regarded as a simple joining together of a bra
vector $\langle A|$ and a ket vector $|B\rangle$. We can therefore identify the separate bra and ket
vectors as follows:


$$\langle A| = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix} \quad\text{and}\quad |B\rangle = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}.$$


A ket spin vector is represented by a column spinor, and a bra spin vector is
represented by a row spinor. To convert a column spinor into the corresponding
row spinor, the rule is to turn the column into a row and take the complex
conjugate of all elements,


$$\text{so if}\quad |A\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \quad\text{then}\quad \langle A| = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}. \tag{3.13}$$


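Worked Example 3.2

Consider the pair of spinors


$$|c\rangle = \begin{bmatrix} 1 \\ i \end{bmatrix} \quad\text{and}\quad |d\rangle = \begin{bmatrix} i \\ 1 \end{bmatrix}.$$


Use spinor notation to normalize these spinors, and find the corresponding
normalized spinors, $|C\rangle$ and $|D\rangle$. Show that $|C\rangle$ and $|D\rangle$ are orthogonal to
one another (i.e. $\langle C|D\rangle = 0$).


Solution

Using matrix multiplication, we have


$$\langle c|c\rangle = \begin{bmatrix} 1 & -i \end{bmatrix}\begin{bmatrix} 1 \\ i \end{bmatrix} = 1 + 1 = 2,$$

$$\langle d|d\rangle = \begin{bmatrix} -i & 1 \end{bmatrix}\begin{bmatrix} i \\ 1 \end{bmatrix} = 1 + 1 = 2.$$


The normalization factor can be taken to be $1/\sqrt{2}$ in both cases, and the
corresponding normalized spinors are


$$|C\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix} \quad\text{and}\quad |D\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} i \\ 1 \end{bmatrix}.$$


To check that $|C\rangle$ and $|D\rangle$ are orthogonal, we must show that their inner
product is equal to zero:


$$\langle C|D\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -i \end{bmatrix}\frac{1}{\sqrt{2}}\begin{bmatrix} i \\ 1 \end{bmatrix} = \frac{1}{2}(i - i) = 0.$$


There is no need to show explicitly that $\langle D|C\rangle = 0$, since we know that
$\langle D|C\rangle = \langle C|D\rangle^*$ (see Exercise 3.5).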
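

The steps of Worked Example 3.2 can be reproduced numerically. The sketch below stores each spinor as a pair of components and takes the normalization constant to be real and positive; the helper names `inner` and `normalize` are illustrative choices, not notation from the text.

```python
import math

def inner(A, B):
    """Inner product <A|B>: conjugate the components of A (Equation 3.12)."""
    return (complex(A[0]).conjugate() * B[0]
            + complex(A[1]).conjugate() * B[1])

def normalize(A):
    """Divide a spinor by the real, positive constant sqrt(<A|A>)."""
    norm = math.sqrt(inner(A, A).real)
    return (A[0] / norm, A[1] / norm)

c = (1, 1j)    # unnormalized spinor |c>
d = (1j, 1)    # unnormalized spinor |d>
C, D = normalize(c), normalize(d)

print(inner(c, c).real, inner(d, d).real)   # squared norms before normalizing
print(abs(inner(C, C)), abs(inner(D, D)))   # squared norms after normalizing
print(abs(inner(C, D)))                     # overlap of the normalized spinors
```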


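Exercise 3.8 Show that the spin vectors


$$|U\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad |V\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 1 \end{bmatrix}$$


are orthonormal (i.e. $\langle U|U\rangle = \langle V|V\rangle = 1$ and $\langle U|V\rangle = 0$).


Alternative pairs of orthonormal basis vectors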


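The vectors $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ provide an orthonormal basis for spin space. They are
orthonormal because


$$\langle{\uparrow_z}|{\uparrow_z}\rangle = \langle{\downarrow_z}|{\downarrow_z}\rangle = 1 \quad\text{and}\quad \langle{\uparrow_z}|{\downarrow_z}\rangle = \langle{\downarrow_z}|{\uparrow_z}\rangle = 0,$$


and they provide a basis because any spin state can be written as


$$|A\rangle = a_1|{\uparrow_z}\rangle + a_2|{\downarrow_z}\rangle,$$

where $a_1$ and $a_2$ are complex numbers.


Orthonormality, in this context, refers to inner products in spin space; it does not
mean that $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ relate to perpendicular directions in real space. In real
space, of course, $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ refer to opposite spins along the z-axis; in
measurements of the z-component of spin, $|{\uparrow_z}\rangle$ always gives the value $+\hbar/2$, and
$|{\downarrow_z}\rangle$ always gives the value $-\hbar/2$.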


Now, there is nothing special about the z-direction. It is possible to prepare
atoms in states that are spin-up or spin-down relative to any direction. For
example, there is a spin-up state $|{\uparrow_x}\rangle$, and a spin-down state $|{\downarrow_x}\rangle$, relative to the
x-direction. In measurements of the x-component of spin, $|{\uparrow_x}\rangle$ always gives the
value $+\hbar/2$, and $|{\downarrow_x}\rangle$ always gives the value $-\hbar/2$.


The vectors $|{\uparrow_x}\rangle$ and $|{\downarrow_x}\rangle$ provide an alternative orthonormal basis in spin space.
This means that any spin state $|A\rangle$ can be written as


$$|A\rangle = b_1|{\uparrow_x}\rangle + b_2|{\downarrow_x}\rangle,$$


where the complex numbers $b_1$ and $b_2$ are interpreted as probability amplitudes
for spin measurements made in the x-direction. When the x-component of spin is
measured in the state $|A\rangle$, the probability of getting the value $+\hbar/2$ is $|b_1|^2$ and
the probability of getting $-\hbar/2$ is $|b_2|^2$.


This idea can be generalized to spin measurements taken in any direction. If $|{\uparrow_n}\rangle$
and $|{\downarrow_n}\rangle$ are spin-up and spin-down states relative to a direction $\mathbf{n}$, we can write
any spin state as


$$|A\rangle = c_1|{\uparrow_n}\rangle + c_2|{\downarrow_n}\rangle, \tag{3.14}$$


and interpret $|c_1|^2$ as the probability of getting the value $+\hbar/2$, and $|c_2|^2$ as the
probability of getting the value $-\hbar/2$, when the spin component in the direction $\mathbf{n}$
is measured.


Section 3.1 described thought experiments in which silver atoms were prepared in
specific spin states, and spin components along various directions were measured.
The above discussion suggests that we might be able to explain the probabilities
quoted in Section 3.1 by taking the modulus squared of coefficients such as $c_1$
or $c_2$. But there is a difficulty. We must first write the given spin state in terms of
spin-up and spin-down states corresponding to the orientation of the spin analyzer.
We do not know how to find such states. The next section will show how this is
done, using the quantum-mechanical operator that represents the spin component
measured in a given direction (an observable quantity).


Exercise 3.9 It turns out that the vectors $|U\rangle$ and $|V\rangle$ in Exercise 3.8 are the
spin-up and spin-down vectors $|{\uparrow_x}\rangle$ and $|{\downarrow_x}\rangle$ relative to the x-direction. Use this
fact to show that


$$|{\uparrow_z}\rangle = \frac{1}{\sqrt{2}}|{\uparrow_x}\rangle - \frac{1}{\sqrt{2}}|{\downarrow_x}\rangle$$


and hence show that when the x-component of spin is measured in the state $|{\uparrow_z}\rangle$,
there is a 50% chance of getting the value $S_x = +\hbar/2$.


(In this chapter, we place hats on 2 × 2 matrices to show that they act as operators.)


3.3 Spin observables in quantum mechanics 


In the previous section, we introduced spinors to represent the spin states of a
spin-$\frac{1}{2}$ particle. This leaves a large gap in the formalism still to be filled. In wave
mechanics, we have wave functions that represent the state of a system; we also
have operators that represent various observables. For example, the Hamiltonian
operator $\hat{H}$ represents the energy of the system. In the context of spin, the most
important observables are the spin components $S_x$, $S_y$ and $S_z$ of a particle. We
therefore ask: what operators should be used to represent these observables in
quantum mechanics?


3.3.1 Matrices representing $S_x$, $S_y$ and $S_z$


In wave mechanics, states are described by wave functions, and observables are
represented by operators such as $\hat{p}_x = -i\hbar\,\partial/\partial x$ or $\hat{L}_z = -i\hbar\,\partial/\partial\phi$, which act
on functions to give new functions.

Spin states, however, are represented by column spinors (2 × 1 matrices), and spin
observables are represented by 2 × 2 matrices that act on column spinors to give
new column spinors. We seek the matrices $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ that represent the spin
components $S_x$, $S_y$ and $S_z$. We start with $\hat{S}_z$.

From the Stern–Gerlach experiment, we know that a measurement of $S_z$ has two
possible outcomes: $+\hbar/2$ and $-\hbar/2$. According to the general principles of
quantum mechanics, this means that the eigenvalues of $\hat{S}_z$ are $\pm\hbar/2$. We also
know that the spin-up state $|{\uparrow_z}\rangle$ gives $+\hbar/2$ with certainty, while the spin-down
state $|{\downarrow_z}\rangle$ gives $-\hbar/2$ with certainty. This means that $|{\uparrow_z}\rangle$ is an eigenvector of $\hat{S}_z$
with eigenvalue $+\hbar/2$, while $|{\downarrow_z}\rangle$ is an eigenvector of $\hat{S}_z$ with eigenvalue $-\hbar/2$.
We therefore require that


$$\hat{S}_z|{\uparrow_z}\rangle = \hat{S}_z\begin{bmatrix} 1 \\ 0 \end{bmatrix} = +\frac{\hbar}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \tag{3.15}$$

$$\hat{S}_z|{\downarrow_z}\rangle = \hat{S}_z\begin{bmatrix} 0 \\ 1 \end{bmatrix} = -\frac{\hbar}{2}\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{3.16}$$


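We are looking for a 2 × 2 matrix $\hat{S}_z$ that satisfies both these equations. Denoting
the elements of this matrix by $a$, $b$, $c$ and $d$, Equation 3.15 gives


$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$


Multiplying out the matrices on the left-hand side gives


$$\begin{bmatrix} a \\ c \end{bmatrix} = \frac{\hbar}{2}\begin{bmatrix} 1 \\ 0 \end{bmatrix},$$


so $a = \hbar/2$ and $c = 0$. In a similar way, Equation 3.16 gives


$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = -\frac{\hbar}{2}\begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad\text{so that}\quad \begin{bmatrix} b \\ d \end{bmatrix} = -\frac{\hbar}{2}\begin{bmatrix} 0 \\ 1 \end{bmatrix},$$


leading to $b = 0$ and $d = -\hbar/2$. Thus we have found that


$$\hat{S}_z = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{3.17}$$


The matrices representing $S_x$ and $S_y$ are harder to establish. We shall not go into
the details here, but will only briefly outline the clues that allow suitable matrices
to be found.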


First, recall that spin is assumed to be a type of angular momentum. This
assumption is supported by the fact that charged particles with spin have magnetic
properties similar to those of orbiting charges. Now, we know that the orbital
angular momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ satisfy the commutation relations


$$\hat{L}_x\hat{L}_y - \hat{L}_y\hat{L}_x = i\hbar\hat{L}_z, \tag{Eqn 2.39}$$

$$\hat{L}_y\hat{L}_z - \hat{L}_z\hat{L}_y = i\hbar\hat{L}_x, \tag{Eqn 2.40}$$

$$\hat{L}_z\hat{L}_x - \hat{L}_x\hat{L}_z = i\hbar\hat{L}_y. \tag{Eqn 2.41}$$


We therefore assume that the spin matrices $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ obey similar
commutation relations:


$$\hat{S}_x\hat{S}_y - \hat{S}_y\hat{S}_x = i\hbar\hat{S}_z, \tag{3.18}$$

$$\hat{S}_y\hat{S}_z - \hat{S}_z\hat{S}_y = i\hbar\hat{S}_x, \tag{3.19}$$

$$\hat{S}_z\hat{S}_x - \hat{S}_x\hat{S}_z = i\hbar\hat{S}_y. \tag{3.20}$$


Given Equation 3.17 for $\hat{S}_z$, these equations place strong restrictions on $\hat{S}_x$
and $\hat{S}_y$. In addition we know that the eigenvalues of all three matrices are $\pm\hbar/2$.
Finally, because $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ represent observable quantities, they must behave
as linear Hermitian operators when they act on spinors.


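The algebra involved in imposing these requirements is straightforward but
tedious; we skip to the final result. The three observables $S_x$, $S_y$ and $S_z$,
corresponding to spin components measured along the x-, y- and z-directions, can
be represented by the following set of matrices:


$$\hat{S}_x = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad \hat{S}_y = \frac{\hbar}{2}\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad \hat{S}_z = \frac{\hbar}{2}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{3.21}$$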

Exercise 3.10 Show that the spin matrices $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ do satisfy
Equation 3.18.


Exercise 3.11 Show that $\hat{S}_y$ has eigenvectors $\dfrac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix}$ and $\dfrac{1}{\sqrt{2}}\begin{bmatrix} i \\ 1 \end{bmatrix}$, with
corresponding eigenvalues $+\hbar/2$ and $-\hbar/2$. What is the physical significance of
these results?
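

The requirements placed on the spin matrices (the commutation relations of Equations 3.18 to 3.20 and the Hermitian property) lend themselves to a quick numerical check. The sketch below assumes the standard spin-$\frac{1}{2}$ matrices of Equation 3.21 and works in units where $\hbar = 1$, a choice made only for convenience; all helper names are illustrative.

```python
HBAR = 1.0  # assumption: units in which hbar = 1

def scale(k, M):
    """Multiply every element of the 2x2 matrix M by the scalar k."""
    return [[k * M[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices (along rows of A, down columns of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    """AB - BA for 2x2 matrices."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    """True if two 2x2 matrices agree element by element."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

Sx = scale(HBAR / 2, [[0, 1], [1, 0]])
Sy = scale(HBAR / 2, [[0, -1j], [1j, 0]])
Sz = scale(HBAR / 2, [[1, 0], [0, -1]])

# Commutation relations, Equations 3.18, 3.19 and 3.20 in turn.
print(close(commutator(Sx, Sy), scale(1j * HBAR, Sz)))
print(close(commutator(Sy, Sz), scale(1j * HBAR, Sx)))
print(close(commutator(Sz, Sx), scale(1j * HBAR, Sy)))

# Hermitian property: each matrix equals its own conjugate transpose.
print(all(close(S, dagger(S)) for S in (Sx, Sy, Sz)))
```

A numerical check of this kind is no substitute for the algebra requested in Exercise 3.10, but it is a useful guard against sign errors.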


Section 8.3.4 of the Mathematical toolkit shows that the square matrix $A$ acts as a
Hermitian operator only if $A_{ij} = A_{ji}^*$ for all $i$ and $j$. Any matrix satisfying this
condition is said to be a Hermitian matrix.


Figure 3.7 A unit vector $\mathbf{n}$ with polar angle $\theta$ and azimuthal angle $\phi$ in spherical
coordinates. The vector $\mathbf{n}$ has Cartesian components $n_x = \sin\theta\cos\phi$,
$n_y = \sin\theta\sin\phi$ and $n_z = \cos\theta$, which are the x-, y- and z-coordinates of the tip
of the arrow representing $\mathbf{n}$.


Remember: $e^{\pm i\phi} = \cos\phi \pm i\sin\phi$.


3.3.2 Spin in an arbitrary direction


By adjusting the orientation of a spin analyzer, we can measure the component of 
spin in any direction — not just along the x-, y- and z-axes. Such a general spin 
component is an observable quantity, and is represented by a 2 x 2 matrix. We 
shall obtain an expression for this general spin matrix. We shall then be able to 
explain what happens in Figures 3.5 and 3.6, where the directions chosen for 
preparing and analyzing spins are related by an arbitrary angle. 


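In classical physics, given a unit vector $\mathbf{n}$, we define the component of a vector $\mathbf{S}$
along $\mathbf{n}$ to be


$$S_n = \mathbf{n}\cdot\mathbf{S} = S\cos\alpha,$$

where $\alpha$ is the angle between the direction of $\mathbf{S}$ and the direction of $\mathbf{n}$.


The vector $\mathbf{n}$ can be specified in terms of the polar and azimuthal angles of
spherical coordinates. The geometry of Figure 3.7 gives


$$\mathbf{n} = \sin\theta\cos\phi\,\mathbf{e}_x + \sin\theta\sin\phi\,\mathbf{e}_y + \cos\theta\,\mathbf{e}_z.$$


So, if spin were a classical vector $\mathbf{S}$, its component $S_n$ in the direction of the unit
vector $\mathbf{n}$ would be


$$S_n = \mathbf{n}\cdot\mathbf{S} = \sin\theta\cos\phi\,S_x + \sin\theta\sin\phi\,S_y + \cos\theta\,S_z.$$


To obtain the corresponding quantum-mechanical operator, we replace the
classical observables $S_x$, $S_y$ and $S_z$ by the corresponding spin matrices. Using
Equation 3.21, this gives


$$\hat{S}_n = \frac{\hbar}{2}\left(\sin\theta\cos\phi\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + \sin\theta\sin\phi\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} + \cos\theta\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\right),$$


which can be simplified to give


$$\hat{S}_n = \frac{\hbar}{2}\begin{bmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{bmatrix}. \tag{3.22}$$


This is the matrix form of the operator that represents the component of spin in
the direction of the unit vector $\mathbf{n}$ shown in Figure 3.7. We call it the general spin
matrix.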

In the Stern–Gerlach experiments of Figures 3.1–3.6, the beam points along the
y-axis. This ensures that all the spin preparers and spin analyzers are oriented in
the xz-plane, with $\phi = 0$. In the special case where $\phi = 0$, Equation 3.22 reduces
to


$$\hat{S}_n = \frac{\hbar}{2}\begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \quad (\text{restricted form for } \phi = 0). \tag{3.23}$$


This is the spin matrix for a direction $\mathbf{n}$ that lies in the xz-plane and makes an
angle $\theta$ with the z-direction. The simpler form of Equation 3.23 is the main
reason we chose to have atoms incident in the y-direction in our discussions of
Stern–Gerlach experiments.


The general spin matrix has two eigenvalues, $+\hbar/2$ and $-\hbar/2$. These are the
possible values of a spin component in any given direction. Corresponding to
these two values, there are two eigenvectors, which we denote by $|{\uparrow_n}\rangle$ and $|{\downarrow_n}\rangle$.
In other words, we have


$$\hat{S}_n|{\uparrow_n}\rangle = +\frac{\hbar}{2}|{\uparrow_n}\rangle,$$

$$\hat{S}_n|{\downarrow_n}\rangle = -\frac{\hbar}{2}|{\downarrow_n}\rangle.$$


If the spin component in the n-direction is measured, the state $|{\uparrow_n}\rangle$ is certain to
give the value $+\hbar/2$, while the state $|{\downarrow_n}\rangle$ is certain to give the value $-\hbar/2$. We
say that $|{\uparrow_n}\rangle$ is spin-up and $|{\downarrow_n}\rangle$ is spin-down, relative to the n-direction.


The eigenvectors $|{\uparrow_n}\rangle$ and $|{\downarrow_n}\rangle$ can be represented by specific spinors. At this
point, we simply state the results (for more details, see Exercise 3.14 below). For
a general direction $\mathbf{n}$, the spin-up and spin-down eigenvectors are


$$|{\uparrow_n}\rangle = \begin{bmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{bmatrix} \quad\text{and}\quad |{\downarrow_n}\rangle = \begin{bmatrix} -e^{-i\phi}\sin(\theta/2) \\ \cos(\theta/2) \end{bmatrix}. \tag{3.24}$$


These two vectors provide an orthonormal basis for spin space. In other words,
any spin state $|A\rangle$ can be written as


$$|A\rangle = c_1|{\uparrow_n}\rangle + c_2|{\downarrow_n}\rangle,$$

where $c_1$ and $c_2$ are complex numbers.
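

To see how these spinors connect with the $\cos^2(\theta/2)$ rule, the sketch below checks the two eigenvalue equations for one direction and then expands $|{\uparrow_z}\rangle$ in the basis $|{\uparrow_n}\rangle$, $|{\downarrow_n}\rangle$. It assumes units with $\hbar = 1$ and uses the fact that, for an orthonormal basis, the coefficients are the inner products $c_1 = \langle{\uparrow_n}|A\rangle$ and $c_2 = \langle{\downarrow_n}|A\rangle$. The direction chosen ($\theta = 60°$, $\phi = 0$) and the helper names are illustrative.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1

def general_spin_matrix(theta, phi):
    """General spin matrix of Equation 3.22."""
    s, c = math.sin(theta), math.cos(theta)
    return [[HBAR / 2 * c, HBAR / 2 * cmath.exp(-1j * phi) * s],
            [HBAR / 2 * cmath.exp(1j * phi) * s, -HBAR / 2 * c]]

def spin_up(theta, phi):
    """Spin-up spinor relative to n (first of Equations 3.24)."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def spin_down(theta, phi):
    """Spin-down spinor relative to n (second of Equations 3.24)."""
    return (-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2))

def apply(M, v):
    """Act with a 2x2 matrix on a column spinor."""
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def inner(A, B):
    """Inner product <A|B> = a1* b1 + a2* b2."""
    return (complex(A[0]).conjugate() * B[0]
            + complex(A[1]).conjugate() * B[1])

theta, phi = math.radians(60), 0.0
Sn = general_spin_matrix(theta, phi)
up, down = spin_up(theta, phi), spin_down(theta, phi)

# Eigenvalue equations: Sn|up_n> = +(hbar/2)|up_n>, Sn|down_n> = -(hbar/2)|down_n>.
Sn_up, Sn_down = apply(Sn, up), apply(Sn, down)
is_up_eigen = all(abs(Sn_up[k] - (HBAR / 2) * up[k]) < 1e-12 for k in range(2))
is_down_eigen = all(abs(Sn_down[k] + (HBAR / 2) * down[k]) < 1e-12 for k in range(2))
print(is_up_eigen, is_down_eigen)

# Expand |up_z> in the n-basis and form the two probabilities.
up_z = (1, 0)
c1, c2 = inner(up, up_z), inner(down, up_z)
p_up, p_down = abs(c1) ** 2, abs(c2) ** 2
print(p_up, math.cos(theta / 2) ** 2)    # compare with cos^2(theta/2)
print(p_down, math.sin(theta / 2) ** 2)  # compare with sin^2(theta/2)
```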

Exercise 3.12 Verify that $|{\uparrow_n}\rangle$ and $|{\downarrow_n}\rangle$ are normalized and orthogonal to
each other.

Exercise 3.13 Find spinors corresponding to:

(a) an eigenvector of $\hat{S}_x$ with eigenvalue $+\hbar/2$;

(b) an eigenvector of $\hat{S}_y$ with eigenvalue $-\hbar/2$;

(c) a state that definitely has $+\hbar/2$ along an axis that lies in the xz-plane and
makes angles of 60° and 30° with the positive z- and x-axes.


Exercise 3.14 Verify that $|{\uparrow_n}\rangle$, as given in Equations 3.24, is an eigenvector
of $\hat{S}_n$ with eigenvalue $+\hbar/2$.


3.3.3 Compatible and incompatible spin observables 


In Section 2.5 we met compatible and incompatible observables of orbital angular 
momentum. Here we shall see that very similar ideas apply to spin. 


If a spin-$\frac{1}{2}$ atom is in a state $|{\uparrow_n}\rangle$, then a measurement of its spin component in
the positive n-direction will certainly give the value $+\hbar/2$, and a measurement of
its spin component in the negative n-direction will certainly give the value $-\hbar/2$.
However, if we measure spin in some other direction, for example the z-direction,
we will not be able to predict the outcome with certainty; this is because the
eigenvectors of $\hat{S}_n$ are superpositions of $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$. Obviously, this result
cannot depend on our choice of axes, so we can make a general statement:


If a particle is in a state with a definite value of the spin component in a
given direction $\mathbf{n}$, it does not have a definite value of the spin component in
any other direction (except for the opposite direction, $-\mathbf{n}$, where the sign of
the spin component is reversed).


Spin components along different axes in space are incompatible observables.
Incompatibility is linked to the fact that $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ do not commute with one
another. If, for example, $\hat{S}_z$ and $\hat{S}_x$ did commute, it would be possible to find a
complete set of vectors that are eigenvectors of both $\hat{S}_z$ and $\hat{S}_x$, corresponding to
states in which both $S_z$ and $S_x$ have definite values. However, these simultaneous
eigenvectors do not exist because Equation 3.20 shows that $\hat{S}_z\hat{S}_x \neq \hat{S}_x\hat{S}_z$.


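This situation follows the general pattern of the last chapter, where you saw that a
particle cannot simultaneously have definite values of $L_x$ and $L_y$ (unless both are
zero). But in that chapter there was an operator,


$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2,$$


representing the square of the magnitude of the orbital angular momentum, which
did commute with the three operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$ for the components of orbital
angular momentum. Thus, $L^2$ could have a definite value at the same time as $L_z$.


The same is true for spin in the sense that the operator defined by


$$\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2,$$


representing the square of the magnitude of spin, commutes with the three
matrices $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$ for the components of spin. To see why, we first note that
it is easy to verify that


$$\hat{S}_x^2 = \hat{S}_y^2 = \hat{S}_z^2 = \frac{\hbar^2}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \frac{\hbar^2}{4}\hat{I}, \tag{3.25}$$


where $\hat{I}$ is the 2 × 2 unit matrix.


● Verify Equation 3.25 for the case of $\hat{S}_x$.

○ Using Equation 3.21 and multiplying out the matrices gives


$$\hat{S}_x^2 = \frac{\hbar^2}{4}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \frac{\hbar^2}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \frac{\hbar^2}{4}\hat{I}.$$


Using Equation 3.25, we see that


$$\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2 = \frac{3\hbar^2}{4}\hat{I}. \tag{3.26}$$


Because $\hat{S}^2$ is a multiple of the unit matrix, it commutes with every 2 × 2 matrix,
and in particular with $\hat{S}_x$, $\hat{S}_y$ and $\hat{S}_z$. If we now apply $\hat{S}^2$ to an arbitrary spinor,
we obtain


$$\hat{S}^2\begin{bmatrix} a \\ b \end{bmatrix} = \frac{3\hbar^2}{4}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \frac{3\hbar^2}{4}\begin{bmatrix} a \\ b \end{bmatrix},$$


so any spinor is an eigenvector of $\hat{S}^2$ with eigenvalue $3\hbar^2/4$. This means that any
state of a spin-$\frac{1}{2}$ particle has $S^2 = 3\hbar^2/4$, and we can say that $S_z$ and $S^2$ are
compatible observables: they can both have definite values in the same state.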

You may recall from Chapter 2 that the allowed values of $L^2$ are $l(l+1)\hbar^2$, where
$l = 0, 1, 2, \ldots$, and the allowed values of $L_z$ are $m\hbar$, where $m$ is an integer
ranging from $-l$ to $+l$.


($l$ and $m$ are called the orbital angular momentum quantum number and the
magnetic quantum number; $s$ and $m_s$ are called the spin quantum number and
the spin magnetic quantum number.)


Now for spin we have something very similar. If we write $S^2 = s(s+1)\hbar^2$, we
see that the particles described in this chapter have $s = 1/2$, which gives


$$S^2 = \tfrac{1}{2}\left(\tfrac{1}{2} + 1\right)\hbar^2 = 3\hbar^2/4,$$

and the allowed values of $S_z$ are $m_s\hbar$ where $m_s = \pm\frac{1}{2}$. Here, at last, is the reason
we refer to electrons as being ‘spin-$\frac{1}{2}$ particles’.

3.4 Predicting the results of measurements 


In this section, we show how the formalism developed so far can be used to 
predict the outcomes of spin measurements, including Stern–Gerlach experiments
of the type discussed in Section 3.1. As always in the quantum world, there are 
limitations to what can be predicted with certainty. 


3.4.1 Some general remarks about spin measurements 


Before carrying out calculations, we first make some general remarks about spin 
measurements and their interpretation, referring back to experiments using 
Stern–Gerlach apparatus (Figure 3.8).


Figure 3.8 An example of an experiment involving Stern–Gerlach magnets. The two
output beams are labelled with the fractions $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$.


We have already stressed that it is incorrect to think of an atom as following a 
definite trajectory through a Stern–Gerlach apparatus. We should not say that an
atom is spin-up relative to a given direction until we have actually measured it to
be so. Rather than thinking of a beam of atoms as being split by a Stern–Gerlach
magnet, with some atoms going one way and other atoms the other way, we 
should think of each atom as being in a linear superposition of two states. The spin 
component of an atom along a given axis is undetermined until it is measured. 


A number of questions come to mind. What constitutes a measurement? When 
does the measurement take place? What does the measurement do to the state of 
the particle?


The spin analyzer in Figure 3.8 lets atoms fall on a detecting screen. If a particular
atom is detected as arriving in the upper trace on the screen, this means (for our 
choice of magnets) that the atom is spin-up immediately after the measurement. 
Before the measurement, the atom is in a linear superposition of states that are 
spin-up and spin-down relative to the orientation vector of the magnet. This 
linear superposition changes abruptly into a spin-up state the instant the atom is 
recorded in the upper trace. It is also possible for the atom to arrive in the lower 
trace, and the linear superposition then changes abruptly into a spin-down state at 
the instant of detection. 


The abrupt change from a linear superposition state to a spin-up or spin-down 
state is a characteristic feature of quantum mechanics. You met something similar 
at the beginning of Book 1: an extended wave function, describing a diffracted 
particle, collapses down to a single pixel when the particle is detected on a screen. 
In Book 1, this was called the collapse of the wave function. Here, we shall use a 
more general term — the collapse of the state vector, reflecting the fact that spin 
states are described by vectors, not functions. 


If this collapse seems mysterious, it is because it is mysterious. There is no 
satisfactory description in terms of any underlying mechanism. There is a great 
deal of controversy about this topic. It does seem, however, that the collapse of 
the state vector occurs when there is an irreversible event such as a particle 
colliding with a screen and leaving its mark. Without such an event, collapse 
does not take place. For example, we can imagine guiding the two beams in a 
Stern—Gerlach analyzer and allowing them to merge together, well before they are 
detected. If this were done, quantum mechanics predicts that there would be no 
collapse, and that the final state emerging from the Stern—Gerlach apparatus 
would be the same as that entering it. In other words, it is not the Stern—Gerlach 
magnets that cause the collapse, but the detectors. 


We assume that the state produced immediately after a measurement is a state that 
is certain to give the value found in the measurement. This makes good sense 
because if we carry out the same measurement twice, in very rapid succession, we 
would expect to get the same results in both cases. In terms of the formalism we 
have developed, this means that: 


When a spin component in the $\mathbf{n}$-direction is measured, the spin state 
collapses onto an eigenvector of $\hat{S}_{\mathbf{n}}$: the one whose eigenvalue is the value 
obtained in the measurement. For example, if we measure $S_{\mathbf{n}}$ and obtain the 
value $-\hbar/2$, the spin state collapses onto $|\downarrow_{\mathbf{n}}\rangle$, which is the eigenvector of 
$\hat{S}_{\mathbf{n}}$ corresponding to the eigenvalue $-\hbar/2$. 


You might wonder how we can investigate the state of an atom after a spin 
measurement. In a spin analyzer, the measurement takes the atom out of the 
beam, making it unavailable for further measurements. However, it is also 
possible to think of a spin preparer as a kind of measurer. 


Suppose that a spin preparer blocks off particles with $S_z = -\hbar/2$ and allows 
particles with $S_z = +\hbar/2$ to pass through. We consider an atom that is in a linear 
superposition of spin-up and spin-down states before it enters the preparer. If this 
atom enters the preparer, and we do not detect it in the blocked-off path, we can 
infer that it has $S_z = +\hbar/2$, and that the initial superposition of spin-up and 
spin-down states has collapsed onto a spin-up state. In effect, this simulates a 
measurement of $S_z$, by a method reminiscent of the Sherlock Holmes story 
where a vital clue was that a dog failed to bark. Having passed through the 
preparer, the atom is still in a beam and further measurements can be carried out 
on it. We can confirm that it does have $S_z = +\hbar/2$ after the measurement, and this 
is exactly what the apparatus sketched in Figure 3.3 would do. 


3.4.2 Calculating probabilities 


The result of sending an atom through a Stern–Gerlach apparatus is uncertain; for 
most initial states, the atom could emerge in either the spin-up beam or the 
spin-down beam. We now show how the probabilities of these two outcomes can 
be calculated. You will see, in particular, how the quantum theory of spin explains 
the $\cos^2(\theta/2)$ rule of Section 3.1. 


Suppose that a spin analyzer is oriented in the $\mathbf{n}$-direction, so that it measures $S_{\mathbf{n}}$, 
the spin component in the $\mathbf{n}$-direction. The corresponding spin matrix $\hat{S}_{\mathbf{n}}$ has 
eigenvectors $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$, and these provide an orthonormal basis for spin 
space. This means that any spin state $|A\rangle$ can be written as 

$$|A\rangle = a_u|\uparrow_{\mathbf{n}}\rangle + a_d|\downarrow_{\mathbf{n}}\rangle, \tag{3.27}$$

where the complex numbers $a_u$ and $a_d$ are the probability amplitudes for 
measuring spin-up or spin-down in the $\mathbf{n}$-direction; the corresponding 
probabilities are $|a_u|^2$ and $|a_d|^2$. (The subscript u stands for spin-up, while d 
stands for spin-down, relative to the direction $\mathbf{n}$.) 


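If the orientations of the spin preparer and the spin analyzer are known, we can 
easily find appropriate spinors for $|A\rangle$, $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$. The remaining problem is 
to find the coefficients $a_u$ and $a_d$. We shall now present an efficient way of doing 
this. 


To find $a_u$, we take the inner product of both sides of Equation 3.27 with $|\uparrow_{\mathbf{n}}\rangle$ to 
get 

$$\langle\uparrow_{\mathbf{n}}|A\rangle = a_u\langle\uparrow_{\mathbf{n}}|\uparrow_{\mathbf{n}}\rangle + a_d\langle\uparrow_{\mathbf{n}}|\downarrow_{\mathbf{n}}\rangle.$$

The eigenvectors are normalized and orthogonal, so $\langle\uparrow_{\mathbf{n}}|\uparrow_{\mathbf{n}}\rangle = 1$ and 
$\langle\uparrow_{\mathbf{n}}|\downarrow_{\mathbf{n}}\rangle = 0$, and we are left with 

$$a_u = \langle\uparrow_{\mathbf{n}}|A\rangle. \tag{3.28}$$

A similar argument, based on taking the inner product with $|\downarrow_{\mathbf{n}}\rangle$, gives 

$$a_d = \langle\downarrow_{\mathbf{n}}|A\rangle. \tag{3.29}$$

The corresponding probabilities are then given by 

$$p_u = |a_u|^2 = |\langle\uparrow_{\mathbf{n}}|A\rangle|^2, \tag{3.30}$$

$$p_d = |a_d|^2 = |\langle\downarrow_{\mathbf{n}}|A\rangle|^2. \tag{3.31}$$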


The only remaining step is to evaluate the inner products with the appropriate 
spinors. To illustrate this process, let us suppose that the orientation vector of the 
spin preparer makes an angle $\theta_1$ with the $z$-direction and has $\phi_1 = 0$. Let us also 
suppose that the spin analyzer is oriented in the direction of the vector $\mathbf{n}$, which 
makes an angle $\theta_2$ with the $z$-direction and has $\phi_2 = 0$. Then, inserting these 
angles into Equation 3.24, we see that the initial state is 

$$|A\rangle = \begin{bmatrix}\cos(\theta_1/2)\\ \sin(\theta_1/2)\end{bmatrix}$$

and the eigenvector characterizing the measurement result is 

$$|\uparrow_{\mathbf{n}}\rangle = \begin{bmatrix}\cos(\theta_2/2)\\ \sin(\theta_2/2)\end{bmatrix}.$$

We therefore have 

$$\langle\uparrow_{\mathbf{n}}|A\rangle = \begin{bmatrix}\cos(\theta_2/2) & \sin(\theta_2/2)\end{bmatrix}\begin{bmatrix}\cos(\theta_1/2)\\ \sin(\theta_1/2)\end{bmatrix} = \cos(\theta_2/2)\cos(\theta_1/2) + \sin(\theta_2/2)\sin(\theta_1/2) = \cos\big((\theta_2 - \theta_1)/2\big).$$

(Standard trigonometric identities are listed inside the back cover of the book.) 
The probability of measuring spin-up in the $\mathbf{n}$-direction is therefore 

$$p_u = |\langle\uparrow_{\mathbf{n}}|A\rangle|^2 = \cos^2\big((\theta_2 - \theta_1)/2\big),$$

where $\theta_2 - \theta_1$ is the angle between the orientations of the preparer and the 
analyzer. So our formalism does indeed explain the $\cos^2(\theta/2)$ rule, which was 
stated without proof in Section 3.1. 
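

The calculation above can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names are our own, the spinor convention is the one quoted from Equation 3.24, and the angles are arbitrary illustrative values. It evaluates $|\langle\uparrow_{\mathbf{n}}|A\rangle|^2$ directly from the two spinors and compares it with $\cos^2((\theta_2 - \theta_1)/2)$.

```python
import cmath
import math


def spin_up_spinor(theta, phi=0.0):
    """Spin-up spinor along the direction with polar angle theta, azimuth phi."""
    return (complex(math.cos(theta / 2)),
            cmath.exp(1j * phi) * math.sin(theta / 2))


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


def prob_spin_up(theta_prep, theta_meas):
    """Probability of spin-up along theta_meas for a state prepared along theta_prep."""
    amplitude = inner(spin_up_spinor(theta_meas), spin_up_spinor(theta_prep))
    return abs(amplitude) ** 2


theta_1, theta_2 = 0.3, 1.1  # illustrative angles in radians (assumed values)
print(prob_spin_up(theta_1, theta_2))          # from the inner product
print(math.cos((theta_2 - theta_1) / 2) ** 2)  # from the cos^2 rule
```

The two printed numbers should agree to within rounding error for any choice of the two angles.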


Exercise 3.15 Express $|\uparrow_x\rangle$ as a linear combination of $|\uparrow_y\rangle$ and $|\downarrow_y\rangle$, and 
hence find the probability of getting the values $+\hbar/2$ and $-\hbar/2$ when $S_y$ is 
measured in the state $|\uparrow_x\rangle$. 


3.4.3 Calculating expectation values 


Now that we know how to calculate the probabilities of getting particular 
outcomes in a spin measurement, we can find the expectation value of a given spin 
component in a given state. 


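Suppose that a spin-$\tfrac{1}{2}$ particle is in a spin state represented by $|A\rangle$, and that we 
measure its spin component in the direction $\mathbf{n}$. The possible values are $+\hbar/2$ and 
$-\hbar/2$, and the corresponding probabilities are $p_u$ and $p_d$. We then define the 
expectation value of $S_{\mathbf{n}}$ in the state $|A\rangle$ to be 

$$\langle S_{\mathbf{n}}\rangle = p_u\left(\frac{\hbar}{2}\right) + p_d\left(-\frac{\hbar}{2}\right) \tag{3.32}$$

$$\phantom{\langle S_{\mathbf{n}}\rangle} = |\langle\uparrow_{\mathbf{n}}|A\rangle|^2\left(\frac{\hbar}{2}\right) - |\langle\downarrow_{\mathbf{n}}|A\rangle|^2\left(\frac{\hbar}{2}\right). \tag{3.33}$$

This formula can be used directly, with $p_u$ and $p_d$, the spin-up and spin-down 
probabilities, found by the methods described in the previous subsection. 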


However, there is another way of calculating the expectation value of a spin 
component, analogous to the sandwich integral rule of wave mechanics. In wave 
mechanics, the expectation value of momentum is given by 

$$\langle p_x\rangle = \langle\Psi|\hat{p}_x|\Psi\rangle = \int_{-\infty}^{\infty}\Psi^*\,\hat{p}_x\,\Psi\,dx.$$

A similar sandwich rule applies in the case of spin. It can be shown that 

$$\langle S_{\mathbf{n}}\rangle = \langle A|\hat{S}_{\mathbf{n}}|A\rangle, \tag{3.34}$$

where $\langle A|$ and $|A\rangle$ are the bra and ket vectors representing the spin state of the 
particle. To evaluate the expectation value, we express the bras and kets as row 
and column spinors, use an appropriate spin matrix $\hat{S}_{\mathbf{n}}$, and multiply out all the 
matrices. The next example shows why this works. 


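Worked Example 3.3 

Essential skill: Evaluating expectation values of spin observables using spinors 


Show that Equations 3.32 and 3.34 are equivalent. 


Solution 


We will start with the expression of Equation 3.34 and show that 
Equation 3.32 follows. First, we express $|A\rangle$ as a linear combination of the 
eigenvectors of $\hat{S}_{\mathbf{n}}$: 

$$|A\rangle = a_u|\uparrow_{\mathbf{n}}\rangle + a_d|\downarrow_{\mathbf{n}}\rangle. \tag{3.35}$$

The corresponding bra vector is found, as usual, by replacing all the kets by 
bras and taking the complex conjugates of the coefficients. This gives 

$$\langle A| = a_u^*\langle\uparrow_{\mathbf{n}}| + a_d^*\langle\downarrow_{\mathbf{n}}|.$$

Now, $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$ are eigenvectors of $\hat{S}_{\mathbf{n}}$ with eigenvalues $+\hbar/2$ and 
$-\hbar/2$, respectively, so we have 

$$\hat{S}_{\mathbf{n}}|A\rangle = \frac{\hbar}{2}\big(a_u|\uparrow_{\mathbf{n}}\rangle - a_d|\downarrow_{\mathbf{n}}\rangle\big).$$

Combining the last two expressions, we obtain 

$$\langle S_{\mathbf{n}}\rangle = \langle A|\hat{S}_{\mathbf{n}}|A\rangle = \frac{\hbar}{2}\big(a_u^*\langle\uparrow_{\mathbf{n}}| + a_d^*\langle\downarrow_{\mathbf{n}}|\big)\big(a_u|\uparrow_{\mathbf{n}}\rangle - a_d|\downarrow_{\mathbf{n}}\rangle\big),$$

which can be simplified, using the orthonormality of $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$, to 

$$\langle S_{\mathbf{n}}\rangle = \frac{\hbar}{2}\big(a_u^* a_u - a_d^* a_d\big) = \frac{\hbar}{2}\big(|a_u|^2 - |a_d|^2\big).$$

Finally, we can write this in the form of Equation 3.32: 

$$\langle S_{\mathbf{n}}\rangle = p_u\left(\frac{\hbar}{2}\right) + p_d\left(-\frac{\hbar}{2}\right),$$

since $p_u = |a_u|^2$ is the probability for spin-up in the $\mathbf{n}$-direction and 
$p_d = |a_d|^2$ is the probability for spin-down in the $\mathbf{n}$-direction. 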
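

The equivalence shown in this worked example can be illustrated numerically. The sketch below is our own illustration rather than part of the original text: it assumes units in which $\hbar = 1$, uses the general spin matrix and eigenvectors in the form quoted in the chapter summary, and picks an arbitrary normalized state and direction. It evaluates $\langle S_{\mathbf{n}}\rangle$ once with the sandwich rule and once from the probabilities.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def spin_matrix(theta, phi):
    """General spin matrix for the direction with polar angles (theta, phi)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[HBAR / 2 * c, HBAR / 2 * cmath.exp(-1j * phi) * s],
            [HBAR / 2 * cmath.exp(1j * phi) * s, -HBAR / 2 * c]]


def eigenvectors(theta, phi):
    """Spin-up and spin-down spinors for the direction (theta, phi)."""
    up = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]
    down = [-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2)]
    return up, down


def inner(bra, ket):
    """Inner product <bra|ket>; the bra components are complex-conjugated."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra, ket))


def expectation_sandwich(state, theta, phi):
    """<A| S_n |A>, evaluated by matrix multiplication (Equation 3.34)."""
    s_n = spin_matrix(theta, phi)
    s_state = [sum(s_n[i][j] * state[j] for j in range(2)) for i in range(2)]
    return inner(state, s_state).real


def expectation_from_probabilities(state, theta, phi):
    """p_u (hbar/2) + p_d (-hbar/2), as in Equation 3.32."""
    up, down = eigenvectors(theta, phi)
    p_u = abs(inner(up, state)) ** 2
    p_d = abs(inner(down, state)) ** 2
    return p_u * (HBAR / 2) + p_d * (-HBAR / 2)


state = [0.6, 0.8j]  # arbitrary normalized spinor: 0.36 + 0.64 = 1
print(expectation_sandwich(state, 1.0, 0.7))
print(expectation_from_probabilities(state, 1.0, 0.7))
```

The two printed values should coincide to within rounding error, whatever state and direction are chosen.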


Exercise 3.16 Find the expectation value of S, for the state in which the 
particle definitely has spin S, = +h/2. Do the calculation two ways, first using 
Equation 3.34, then using Equation 3.32. 


3.5 Energy levels and time-development 


A spin-$\tfrac{1}{2}$ particle, such as a silver atom, behaves like a tiny magnetic dipole. That 
is why silver atoms are deflected by the inhomogeneous magnetic field of a 
Stern–Gerlach magnet. In the simpler case of a uniform magnetic field, a spin-$\tfrac{1}{2}$ 
particle will have a magnetic potential energy that depends on the orientation of 
its spin relative to the magnetic field. 


In this section, we will use these ideas to obtain a Hamiltonian operator that 
describes the energy associated with spin orientation. We will then address two 
important questions about a spin-$\tfrac{1}{2}$ particle in a magnetic field: 


• What are its energy levels? 
• How does a given spin state evolve in time? 


These questions will be answered by writing down and solving the 
time-independent Schrödinger equation and the Schrödinger equation for a spin-$\tfrac{1}{2}$ 
particle in a magnetic field. 


3.5.1 The Hamiltonian operator and energy levels 


Chapter 2 explained that there is a relationship between the orbital angular 
momentum $\mathbf{L}$ of a charged particle, and the resulting magnetic dipole moment $\boldsymbol{\mu}$: 

$$\boldsymbol{\mu} = \gamma\mathbf{L},$$

where the proportionality constant $\gamma$ is the gyromagnetic ratio. This constant is 
proportional to the charge-to-mass ratio of the orbiting particle, and is negative for 
an electron because of its negative charge. 


Something very similar applies to spin. In this case, we have 

$$\boldsymbol{\mu} = \gamma_s\mathbf{S}, \tag{3.36}$$

where the proportionality constant $\gamma_s$ is called the spin gyromagnetic ratio. The 
value of this constant depends on the type of particle, and is negative for an 
electron. In general, $\gamma_s \neq \gamma$. 


The classical expression for the potential energy of a magnetic dipole, of magnetic 
dipole moment $\boldsymbol{\mu}$, in a magnetic field $\mathbf{B}$ is 

$$E_{\text{mag}} = -\boldsymbol{\mu}\cdot\mathbf{B}. \tag{3.37}$$

Combining Equations 3.36 and 3.37, we obtain 

$$E_{\text{mag}} = -\gamma_s\,\mathbf{S}\cdot\mathbf{B}. \tag{3.38}$$

Let us suppose that the magnetic field has magnitude $B$ and points in the direction 
of the unit vector $\mathbf{n}$. Then $\mathbf{B} = B\mathbf{n}$, and Equation 3.38 takes the form 

$$E_{\text{mag}} = -\gamma_s B\,\mathbf{n}\cdot\mathbf{S} = -\gamma_s B\,S_{\mathbf{n}}, \tag{3.39}$$

where $S_{\mathbf{n}}$ is the spin component in the direction of the magnetic field. 


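Equation 3.39 is the energy contribution associated with the alignment of the spin 
relative to the magnetic field. A real particle would have other forms of energy, 
including kinetic energy, but we shall assume that these do not influence the 
behaviour of its spin, and so can be omitted from our discussion. Equation 3.39 is 
a classical Hamiltonian function which contains no kinetic energy terms. We can 
obtain the corresponding Hamiltonian operator $\hat{H}$ by replacing the observable $S_{\mathbf{n}}$ 
by the corresponding general spin matrix, $\hat{S}_{\mathbf{n}}$. This gives 

$$\hat{H} = -\gamma_s B\,\hat{S}_{\mathbf{n}} = -\frac{\gamma_s B\hbar}{2}\begin{bmatrix}\cos\theta & e^{-i\phi}\sin\theta\\ e^{i\phi}\sin\theta & -\cos\theta\end{bmatrix}. \tag{3.40}$$

In the special case where the magnetic field points along the $z$-axis, for example, 
we have 

$$\hat{H} = -\gamma_s B\,\hat{S}_z = -\frac{\gamma_s B\hbar}{2}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}.$$

In general, the quantity $\hat{H}$ defined in Equation 3.40 is called the Hamiltonian 
matrix. The equation shows that the Hamiltonian matrix $\hat{H}$ is proportional to the 
general spin matrix $\hat{S}_{\mathbf{n}}$, where $\mathbf{n}$ is a unit vector in the direction of the magnetic 
field. Hence $\hat{H}$ and $\hat{S}_{\mathbf{n}}$ share the same eigenvectors, $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$, and we can 
write 

$$\hat{H}|\uparrow_{\mathbf{n}}\rangle = -\frac{\gamma_s B\hbar}{2}|\uparrow_{\mathbf{n}}\rangle \quad\text{and}\quad \hat{H}|\downarrow_{\mathbf{n}}\rangle = +\frac{\gamma_s B\hbar}{2}|\downarrow_{\mathbf{n}}\rangle. \tag{3.41}$$

In other words, 

$$\hat{H}|\uparrow_{\mathbf{n}}\rangle = E_u|\uparrow_{\mathbf{n}}\rangle \quad\text{and}\quad \hat{H}|\downarrow_{\mathbf{n}}\rangle = E_d|\downarrow_{\mathbf{n}}\rangle, \tag{3.42}$$

where 

$$E_u = -\frac{\gamma_s B\hbar}{2} \quad\text{and}\quad E_d = +\frac{\gamma_s B\hbar}{2}. \tag{3.43}$$

(The subscript u stands for spin-up, while d stands for spin-down, relative to the 
direction $\mathbf{n}$ of the magnetic field.) 


Equation 3.42 is just the time-independent Schrödinger equation, written out for 
both of its solutions. The energy eigenvectors are $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$, and the energy 
eigenvalues are $E_u$ and $E_d$. 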


We therefore have two energy levels, $E_u$ and $E_d$. Which of these is lower in 
energy depends on the sign of $\gamma_s$. For particles such as electrons, $\gamma_s < 0$ so 
$E_u > E_d$. Other particles, such as protons, have $\gamma_s > 0$, and in this case $E_u < E_d$. 
In both cases there are two energy levels, separated by 

$$|E_u - E_d| = |\gamma_s| B\hbar = \hbar\omega,$$

where $\omega = |\gamma_s|B$ is a positive quantity called the Larmor frequency. The 
lower level is $-\hbar\omega/2$ and the upper level is $+\hbar\omega/2$, but which of these levels 
corresponds to spin-up or spin-down relative to the direction of the magnetic field 
depends on the particle involved, as shown in Figure 3.9. 
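

The dependence of the level ordering on the sign of $\gamma_s$ can be made concrete with a short calculation. The following is a minimal sketch: the function names and the numerical values of $\gamma_s$ and $B$ are hypothetical, chosen only for illustration, and $\gamma_s$ is assumed to be given in $\text{s}^{-1}\,\text{T}^{-1}$. It evaluates Equation 3.43 for one negative and one positive gyromagnetic ratio.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def energy_levels(gamma_s, b_field):
    """Return (E_u, E_d) in joules, following Equation 3.43."""
    e_u = -gamma_s * b_field * HBAR / 2
    e_d = +gamma_s * b_field * HBAR / 2
    return e_u, e_d


def larmor_frequency(gamma_s, b_field):
    """Larmor frequency omega = |gamma_s| B (always positive)."""
    return abs(gamma_s) * b_field


# Hypothetical gyromagnetic ratios of either sign (illustrative values only).
for gamma_s in (-2.0e8, +2.0e8):
    e_u, e_d = energy_levels(gamma_s, 1.5)  # assumed field of 1.5 T
    lower = "spin-down" if e_d < e_u else "spin-up"
    print(gamma_s, e_u, e_d, "lower level:", lower)
```

For the negative ratio the spin-down state is reported as the lower level, and for the positive ratio the spin-up state is, while the separation between the levels is $\hbar\omega$ in both cases.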


Figure 3.9 Energy levels for a spin-$\tfrac{1}{2}$ particle in a magnetic field: (a) particles 
such as electrons with $\gamma_s < 0$, for which $|\uparrow_{\mathbf{n}}\rangle$ has energy $E_u = +\hbar\omega/2$ and $|\downarrow_{\mathbf{n}}\rangle$ has 
energy $E_d = -\hbar\omega/2$; (b) particles such as protons with $\gamma_s > 0$, for which $|\downarrow_{\mathbf{n}}\rangle$ has 
energy $E_d = +\hbar\omega/2$ and $|\uparrow_{\mathbf{n}}\rangle$ has energy $E_u = -\hbar\omega/2$. 


These ideas have important applications. A magnetic resonance imaging (MRI) 
scanner applies a strong magnetic field to ensure that the spin-up and spin-down 
states of protons in living tissue have slightly different energies. It then uses a 
radio frequency pulse to transfer protons from the spin-up state to the spin-down 
state. By monitoring the subsequent behaviour of the protons, it is possible 
to produce detailed maps of internal body organs (Figure 3.10). 


Figure 3.10 An MRI image 
showing a vertical ‘slice’ 
through a human head. 


Exercise 3.17 A proton has a spin gyromagnetic ratio $\gamma_s = 4.26 \times 10^{7}\ \text{Hz T}^{-1}$. 
It is placed in a magnetic field of magnitude 3.00 T. What frequency of 
electromagnetic radiation is required to promote a proton from the spin-up state to 
the spin-down state? 


3.5.2 Time-development of spin states 


Finally, we discuss the time-development of spin states in a magnetic field. This is 
also an important part of the way an MRI scanner works, as it gives us the ability 
to monitor the protons after they have been excited by a radio frequency pulse. 


To predict how states evolve in time, we use Schrödinger’s equation. For a spin-$\tfrac{1}{2}$ 
particle, the Hamiltonian operator is a 2 × 2 matrix, and it acts on a 2 × 1 spinor. 
Nevertheless, Schrödinger’s equation still applies and it takes the form 

$$i\hbar\frac{d|A\rangle}{dt} = \hat{H}|A\rangle, \tag{3.44}$$

where $|A\rangle$ is the spin state of a particle at any time $t$. Now, suppose that we know 
the initial spin state, $|A\rangle_{\text{initial}}$, at time $t = 0$. How can we predict the spin state 
$|A\rangle$ at some future time $t$? The secret is to represent spin states in a suitable basis, 
one that simplifies the calculation. The ‘right’ choice is provided by the 
eigenvectors of $\hat{H}$, namely $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$. 


Given a magnetic field $\mathbf{B} = B\mathbf{n}$, the first step is to find the eigenvectors $|\uparrow_{\mathbf{n}}\rangle$ and 
$|\downarrow_{\mathbf{n}}\rangle$, and the next step is to expand the initial state $|A\rangle_{\text{initial}}$ in terms of these 
vectors: 

$$|A\rangle_{\text{initial}} = a_u|\uparrow_{\mathbf{n}}\rangle + a_d|\downarrow_{\mathbf{n}}\rangle.$$

Both of these steps were explored in previous sections. The key point, which we 
now introduce, is that the spin state at any later time is given by 

$$|A\rangle = a_u e^{-iE_u t/\hbar}|\uparrow_{\mathbf{n}}\rangle + a_d e^{-iE_d t/\hbar}|\downarrow_{\mathbf{n}}\rangle. \tag{3.45}$$

This is reminiscent of the formula used to predict the time-development of a wave 
packet in a harmonic oscillator (Book 1, Chapter 6). To see why it works, we first 
note that the exponential factors in Equation 3.45 become equal to 1 at $t = 0$, so 
$|A\rangle$ becomes equal to $|A\rangle_{\text{initial}}$ at $t = 0$, as it must. Secondly, we can show that 
the above expression for $|A\rangle$ satisfies Schrödinger’s equation. To establish this, 
we substitute Equation 3.45 into Equation 3.44. Substituting into the left-hand 
side gives 

$$i\hbar\frac{d|A\rangle}{dt} = i\hbar\Big((-iE_u/\hbar)\,a_u e^{-iE_u t/\hbar}|\uparrow_{\mathbf{n}}\rangle + (-iE_d/\hbar)\,a_d e^{-iE_d t/\hbar}|\downarrow_{\mathbf{n}}\rangle\Big) = E_u a_u e^{-iE_u t/\hbar}|\uparrow_{\mathbf{n}}\rangle + E_d a_d e^{-iE_d t/\hbar}|\downarrow_{\mathbf{n}}\rangle, \tag{3.46}$$

while substituting into the right-hand side gives 

$$\hat{H}|A\rangle = E_u a_u e^{-iE_u t/\hbar}|\uparrow_{\mathbf{n}}\rangle + E_d a_d e^{-iE_d t/\hbar}|\downarrow_{\mathbf{n}}\rangle, \tag{3.47}$$

because $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$ are eigenvectors of $\hat{H}$ with eigenvalues $E_u$ and $E_d$. Since 
the right-hand sides of Equations 3.46 and 3.47 are equal, the left-hand sides must 
also be equal. Thus $|A\rangle$ satisfies both the initial condition and Schrödinger’s 
equation; it therefore describes the spin state of the particle at all times (so long as 
the particle remains undisturbed by measurements). 
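

The argument that Equation 3.45 satisfies Schrödinger’s equation can also be tested numerically. The sketch below is an illustration under stated assumptions, not a method taken from the text: it works in units with $\hbar = 1$, represents $|A\rangle$ by its two coefficients in the energy eigenbasis (where $\hat{H}$ is diagonal), and approximates the time derivative by a central difference. The coefficients and energies are arbitrary test values.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1


def evolve(a_u, a_d, e_u, e_d, t):
    """Coefficients of |A> in the energy eigenbasis at time t (Equation 3.45)."""
    return (a_u * cmath.exp(-1j * e_u * t / HBAR),
            a_d * cmath.exp(-1j * e_d * t / HBAR))


def schrodinger_residual(a_u, a_d, e_u, e_d, t, dt=1e-5):
    """Largest component of i*hbar*d|A>/dt - H|A>, using a central difference."""
    later = evolve(a_u, a_d, e_u, e_d, t + dt)
    earlier = evolve(a_u, a_d, e_u, e_d, t - dt)
    now = evolve(a_u, a_d, e_u, e_d, t)
    lhs = [1j * HBAR * (p - m) / (2 * dt) for p, m in zip(later, earlier)]
    rhs = [e_u * now[0], e_d * now[1]]  # H is diagonal in its own eigenbasis
    return max(abs(l - r) for l, r in zip(lhs, rhs))


# Arbitrary normalized coefficients and energies E_u = +1/2, E_d = -1/2.
print(evolve(0.6, 0.8j, 0.5, -0.5, 0.0))                # initial coefficients
print(schrodinger_residual(0.6, 0.8j, 0.5, -0.5, 2.0))  # should be tiny
```

The residual is limited only by the finite-difference step and rounding, which is consistent with Equations 3.46 and 3.47 having equal right-hand sides.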


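Worked Example 3.4 

Essential skill: Predicting the time-dependence of a spin state in a magnetic field 


At time $t = 0$, a silver atom is prepared in the initial spin state 
$|A\rangle_{\text{initial}} = |\uparrow_z\rangle$. The atom has a spin gyromagnetic ratio $\gamma_s < 0$, and is in a 
uniform magnetic field $\mathbf{B} = B\mathbf{e}_x$, so its Larmor frequency is $\omega = -\gamma_s B$. 


Predict the spin state of the atom at any later time $t$, and hence obtain $\langle S_z\rangle$ 
as a function of time, expressing your answers in terms of the Larmor 
frequency. 


Solution 


The Hamiltonian matrix is $\hat{H} = -\gamma_s B\,\hat{S}_x = \omega\hat{S}_x$. Using Equations 3.24 
with $\theta = \pi/2$ and $\phi = 0$, we see that this matrix has eigenvectors 

$$|\uparrow_x\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}1\\ 1\end{bmatrix} \quad\text{and}\quad |\downarrow_x\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}-1\\ 1\end{bmatrix},$$

with the eigenvalues $E_u = +\hbar\omega/2$ and $E_d = -\hbar\omega/2$ shown in Figure 3.9a. 


We must expand the initial spin state in terms of these eigenvectors. To do 
this, we write 

$$|A\rangle_{\text{initial}} = |\uparrow_z\rangle = a_u|\uparrow_x\rangle + a_d|\downarrow_x\rangle,$$

and then find the coefficients from 

$$a_u = \langle\uparrow_x|\uparrow_z\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & 1\end{bmatrix}\begin{bmatrix}1\\ 0\end{bmatrix} = \frac{1}{\sqrt{2}},$$

$$a_d = \langle\downarrow_x|\uparrow_z\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}-1 & 1\end{bmatrix}\begin{bmatrix}1\\ 0\end{bmatrix} = -\frac{1}{\sqrt{2}}.$$

This gives 

$$|A\rangle_{\text{initial}} = \frac{1}{\sqrt{2}}\big(|\uparrow_x\rangle - |\downarrow_x\rangle\big),$$

so, for any time $t$, 

$$|A\rangle = \frac{1}{\sqrt{2}}\big(e^{-i\omega t/2}|\uparrow_x\rangle - e^{+i\omega t/2}|\downarrow_x\rangle\big) = \frac{1}{2}\left(e^{-i\omega t/2}\begin{bmatrix}1\\ 1\end{bmatrix} - e^{+i\omega t/2}\begin{bmatrix}-1\\ 1\end{bmatrix}\right) = \frac{1}{2}\begin{bmatrix}e^{-i\omega t/2} + e^{+i\omega t/2}\\ e^{-i\omega t/2} - e^{+i\omega t/2}\end{bmatrix} = \begin{bmatrix}\cos(\omega t/2)\\ -i\sin(\omega t/2)\end{bmatrix}.$$

(Remember: $e^{ix} = \cos x + i\sin x$.) 


The expectation value of $S_z$ at time $t$ is then given by 

$$\langle S_z\rangle = \langle A|\hat{S}_z|A\rangle = \frac{\hbar}{2}\begin{bmatrix}\cos(\omega t/2) & i\sin(\omega t/2)\end{bmatrix}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\begin{bmatrix}\cos(\omega t/2)\\ -i\sin(\omega t/2)\end{bmatrix} = \frac{\hbar}{2}\big(\cos^2(\omega t/2) - \sin^2(\omega t/2)\big) = \frac{\hbar}{2}\cos(\omega t).$$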
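

As a cross-check on this worked example, the state can be built numerically from Equation 3.45 and compared with the closed-form results. This is a minimal sketch under our own assumptions: units with $\hbar = 1$, an arbitrary illustrative Larmor frequency, and function names that are not from the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def spin_state(t, omega):
    """Spinor at time t for initial state |up_z> in a field along x (gamma_s < 0)."""
    s = 1 / math.sqrt(2)
    up_x, down_x = (s, s), (-s, s)               # eigenvectors of S_x
    a_u, a_d = s, -s                             # expansion coefficients of |up_z>
    c_u = a_u * cmath.exp(-1j * omega * t / 2)   # E_u = +hbar*omega/2
    c_d = a_d * cmath.exp(+1j * omega * t / 2)   # E_d = -hbar*omega/2
    return (c_u * up_x[0] + c_d * down_x[0],
            c_u * up_x[1] + c_d * down_x[1])


def expectation_sz(state):
    """<S_z> = (hbar/2)(|upper component|^2 - |lower component|^2)."""
    return (HBAR / 2) * (abs(state[0]) ** 2 - abs(state[1]) ** 2)


omega = 2.0  # illustrative Larmor frequency (assumed value)
for t in (0.0, 0.4, 1.3):
    print(t, expectation_sz(spin_state(t, omega)), (HBAR / 2) * math.cos(omega * t))
```

At each time the second and third printed columns should agree, showing $\langle S_z\rangle$ oscillating as $(\hbar/2)\cos(\omega t)$.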


Exercise 3.18 How does the expectation value of Sy vary with time in the state 
described in Worked Example 3.4? 


Exercise 3.19 A spin-$\tfrac{1}{2}$ particle with $\gamma_s > 0$ is in a uniform magnetic field that 
points in the $y$-direction. At time $t = 0$, it is in the spin state $|\uparrow_x\rangle$. Find the spinor 
that represents the state of the particle at any time $t$, expressing your answer in 
terms of the Larmor frequency. (The results of Exercise 3.15 may be useful.) 


Summary of Chapter 3 


Section 3.1 Any spin component of a spin-$\tfrac{1}{2}$ particle has two possible 
values, $+\hbar/2$ and $-\hbar/2$, corresponding to the two beams that emerge from a 
Stern–Gerlach magnet. The probability that an atom will be measured to be 
spin-up relative to one Stern–Gerlach magnet, when it has been prepared to be 
spin-up relative to another Stern–Gerlach magnet, is equal to $\cos^2(\theta/2)$, where $\theta$ 
is the angle between the orientation vectors of the two magnets. 


Section 3.2 The spin state of a spin-$\tfrac{1}{2}$ particle is represented by a vector in spin 
space, which can be conveniently written as a two-element matrix called a spinor. 


The inner product of $|A\rangle = \begin{bmatrix}a_1\\ a_2\end{bmatrix}$ and $|B\rangle = \begin{bmatrix}b_1\\ b_2\end{bmatrix}$ is given by 

$$\langle A|B\rangle = \begin{bmatrix}a_1^* & a_2^*\end{bmatrix}\begin{bmatrix}b_1\\ b_2\end{bmatrix} = a_1^* b_1 + a_2^* b_2.$$

As always, we have $\langle A|B\rangle^* = \langle B|A\rangle$. 


Section 3.3 The spin component in a given direction $\mathbf{n}$ is an observable quantity, 
represented in quantum mechanics by a 2 × 2 matrix 

$$\hat{S}_{\mathbf{n}} = \frac{\hbar}{2}\begin{bmatrix}\cos\theta & e^{-i\phi}\sin\theta\\ e^{i\phi}\sin\theta & -\cos\theta\end{bmatrix}.$$

This matrix has eigenvectors 

$$|\uparrow_{\mathbf{n}}\rangle = \begin{bmatrix}\cos(\theta/2)\\ e^{i\phi}\sin(\theta/2)\end{bmatrix} \quad\text{and}\quad |\downarrow_{\mathbf{n}}\rangle = \begin{bmatrix}-e^{-i\phi}\sin(\theta/2)\\ \cos(\theta/2)\end{bmatrix}$$

and eigenvalues $+\hbar/2$ and $-\hbar/2$. The two eigenvectors form an orthonormal 
basis in spin space. They are the only states that have definite values of the spin 
component in the $\mathbf{n}$-direction. 


Spin matrices along different Cartesian axes obey commutation relations similar 
to those for the components of orbital angular momentum. As a result, it is 
impossible for two (different and non-opposite) components of spin to have 
definite values in the same state. The matrix $\hat{S}^2 = \hat{S}_x^2 + \hat{S}_y^2 + \hat{S}_z^2$ commutes with 
all spin matrices and has the value $s(s+1)\hbar^2$ in any state, where $s = 1/2$. This is 
the origin of the term ‘spin-$\tfrac{1}{2}$ particle’. 


Section 3.4 Spin components are undetermined until they are measured. On 
measurement of $S_{\mathbf{n}}$, the spin state vector collapses onto an eigenvector of $\hat{S}_{\mathbf{n}}$: 
the one whose eigenvalue is equal to the value obtained in the measurement. 


If a particle is in the spin state 

$$|A\rangle = a_u|\uparrow_{\mathbf{n}}\rangle + a_d|\downarrow_{\mathbf{n}}\rangle,$$

and the spin component in the $\mathbf{n}$-direction is measured, the probability of getting 
the spin-up value $+\hbar/2$ is $|a_u|^2 = |\langle\uparrow_{\mathbf{n}}|A\rangle|^2$, and the probability of getting the 
spin-down value $-\hbar/2$ is $|a_d|^2 = |\langle\downarrow_{\mathbf{n}}|A\rangle|^2$. Detailed calculations confirm the 
$\cos^2(\theta/2)$ rule. 


The expectation value of $S_{\mathbf{n}}$ is given by the sandwich rule $\langle S_{\mathbf{n}}\rangle = \langle A|\hat{S}_{\mathbf{n}}|A\rangle$. 


Section 3.5 A spin-$\tfrac{1}{2}$ particle in a magnetic field $\mathbf{B} = B\mathbf{n}$ has the Hamiltonian 
matrix $\hat{H} = -\gamma_s B\,\hat{S}_{\mathbf{n}}$, where $\gamma_s$ is the spin gyromagnetic ratio of the particle. 


The time-independent Schrödinger equation has two energy eigenvectors that are 
the eigenvectors of $\hat{S}_{\mathbf{n}}$: $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$, and two energy eigenvalues, $\pm\hbar\omega/2$, 
where $\omega = |\gamma_s|B$ is the Larmor frequency. The (time-dependent) Schrödinger 
equation is solved by expanding the initial spin state in terms of $|\uparrow_{\mathbf{n}}\rangle$ and $|\downarrow_{\mathbf{n}}\rangle$. 
The time-dependent spin state is then obtained by inserting appropriate factors of 
$e^{\pm i\omega t/2}$. 

Achievements from Chapter 3 


After studying this chapter, you should be able to: 
3.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


3.2 Describe the behaviour of spin-$\tfrac{1}{2}$ particles in various combinations of 
Stern–Gerlach magnets. 


3.3 Explain the roles of spin preparers and spin analyzers. Recall and use the 
$\cos^2(\theta/2)$ rule. 


3.4 Write spin states in terms of ket vectors and spinors, using appropriate 
conventions. Explain what is meant by an orthonormal basis. 


3.5 Use spinors to evaluate inner products of vectors in spin space, and to 
normalize spinors. 


3.6 Given expressions for the general spin matrix and its eigenvectors, find 
specific forms appropriate in given circumstances. 


3.7 Find the probabilities of the possible outcomes of a given spin measurement 
in a given spin state. 


3.8 Find the expectation value of a given spin observable in a given spin state. 


3.9 Write down a Hamiltonian matrix that describes the interaction of a spin-$\tfrac{1}{2}$ 
particle with a uniform magnetic field; find the corresponding energy 
eigenvectors and eigenvalues. 


3.10 Use Schrödinger’s equation to predict the time-dependence of a spin state 
describing a spin-$\tfrac{1}{2}$ particle in a magnetic field. 


Introduction: many-particle systems 


In this chapter, the systems we study become more realistic in one important 
respect: they involve more than one particle. It will not have escaped your 
attention that virtually all the systems we have studied so far consist of a single 
particle bound in a well, or tunnelling through a barrier, etc. But all the interesting 
systems we would like to describe, such as atoms, molecules, solids, etc., contain 
many particles. The main part of this chapter presents the quantum theory 
of systems of two particles. Although this might not seem like a great leap 
forward, the basic ideas are those required to describe systems with any number 
of particles. Two-particle systems also prepare the way for the remarkable 
consequences of quantum entanglement, the subject of the final chapters of this 
book. 


The quantum world differs from the classical world in a radical way that becomes 
apparent when we study systems of more than one particle. In the quantum world, 
particles of a given type (such as electrons, protons, helium atoms, etc.) are 
identical in a way that has no parallel in the everyday world. Manufacturers try 
hard to make all their white snooker balls identical, but they cannot succeed. All 
electrons on the other hand are absolutely identical, a fact that has profound 
consequences for the behaviour of matter; it lies behind the Pauli exclusion 
principle which will feature later in this chapter. The properties of atoms and, 
in fact, of all the ‘stuff’ that surrounds us, are crucially dependent upon this 
principle. That is why much of this chapter deals with the mathematical 
formalism for dealing with identical particles. Before that, however, we must start 
with the quantum mechanics of two distinguishable particles. 


Section 4.1 presents the basic quantum theory of a system of two distinguishable 
particles. We start with spatial wave functions of two particles and then show how 
to combine this with a representation of their spin to obtain a total wave function. 
Section 4.2 introduces a new and uniquely quantum feature: the fact that all 
particles of a specific kind are absolutely identical. It turns out that microscopic 
particles fall into two distinct categories: fermions and bosons. You will see that 
identical bosons must be described by a symmetric total wave function, and 
identical fermions by an antisymmetric total wave function. Finally, Section 4.3 
contrasts the behaviour of fermions and bosons in real systems. You will see how 
the fermion nature of electrons contributes to the rigidity of metals, and how the 
boson nature of some atoms leads to a remarkable quantum phenomenon called 
Bose-Einstein condensation. 


4.1 Systems of two distinguishable particles 


In previous chapters we have presented the quantum mechanics of a single 
particle, such as a particle in a box. Here, we take the first steps toward describing 
systems of more than one particle. We shall not solve Schrödinger’s equation for 
particular systems, but rather extract some very important general properties of 
wave functions for any system of more than one particle. Many of these features 
already occur with systems of just two particles, and we focus on two-particle 
systems in this section. All the ideas extend straightforwardly to the general case 
of $N$ particles. 


(To simplify notation, we write $p_{x_i}$ as $p_i$.) 


In this section we impose further temporary simplifications: we deal only with 
distinguishable particles and restrict ourselves to the case in which each particle 
is confined to one dimension, the x-axis. We also assume the particles do not 
interact with each other, although they may be subject to external forces. In fact, 
there are few physical situations in which particles do not interact. However, 
working with non-interacting particles greatly simplifies the points we wish to 
make in this chapter. Even in the cases where the particles do interact, the 
non-interacting case is a useful starting point for more elaborate calculations. 


4.1.1 Schrödinger’s equation and wave functions 


The system we shall initially consider has two distinguishable particles, 1 and 2, 
confined to the $x$-axis. The particles have masses $m_1$ and $m_2$ and coordinates $x_1$ 
and $x_2$. The external forces acting on the particles are expressed in terms of 
potential energy functions $V_1(x_1)$ and $V_2(x_2)$. The wave function describing the 
system depends on $x_1$ and $x_2$ and on time $t$, and is written as $\Psi(x_1, x_2, t)$. It 
satisfies Schrödinger’s equation for the system in question. 


The first step in writing down Schrödinger’s equation is to find the classical 
Hamiltonian function. This is the sum of the two kinetic energies, written in terms 
of momenta, and the two potential energies: 

$$H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V_1(x_1) + V_2(x_2). \tag{4.1}$$

Since the particles do not interact with each other there is no mutual potential 
energy term, $V(x_1 - x_2)$, and $H$ is simply a sum of terms, $H_1 + H_2$, associated 
with the individual particles. 


To convert the Hamiltonian function $H$ into a Hamiltonian operator $\hat{H}$, we replace 
the momenta of the two particles by momentum operators 

$$p_i \Longrightarrow \hat{p}_i = -i\hbar\frac{\partial}{\partial x_i}, \tag{4.2}$$

where the index $i$ can be 1 or 2. With this substitution, the Hamiltonian function 
becomes the Hamiltonian operator: 

$$H \Longrightarrow \hat{H} = -\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + V_1(x_1) + V_2(x_2). \tag{4.3}$$

The wave function $\Psi(x_1, x_2, t)$ is then a solution of Schrödinger’s equation: 

$$i\hbar\frac{\partial\Psi(x_1, x_2, t)}{\partial t} = \hat{H}\,\Psi(x_1, x_2, t). \tag{4.4}$$

Before Schrödinger’s equation can be solved, we need to know the Hamiltonian 
operator $\hat{H}$ for the specific system we are studying. The following exercise 
illustrates the process of finding $\hat{H}$. 


Exercise 4.1 A one-dimensional two-particle system consists of particle 1 of 
mass $m_1$ and particle 2 of mass $m_2$. Particle 1 is subject to a force whose 
potential energy function is $\tfrac{1}{2}C_1 x_1^2$, and particle 2 is subject to a force whose 
potential energy function is $\tfrac{1}{2}C_2 x_2^2$, where $C_1$ and $C_2$ are force constants. Find 
the Hamiltonian function and Hamiltonian operator for this system. Write down 
the explicit form of Schrödinger’s equation for this system. 


It is not difficult to write down Schrödinger’s equation for a specific system of 
more than one particle, but we still have to solve the equation and interpret its 
solutions. The first step is to look for stationary-state solutions, which are the 
product of two functions: 

$$\Psi(x_1, x_2, t) = \psi(x_1, x_2)\,T(t). \tag{4.5}$$

These two functions are: 


• a time-dependent part, $T(t) = e^{-iEt/\hbar}$, the solution of 

$$i\hbar\frac{dT}{dt} = E\,T(t); \tag{4.6}$$

• a spatial eigenfunction, the solution of 

$$\left[-\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + V_1(x_1) + V_2(x_2)\right]\psi(x_1, x_2) = E\,\psi(x_1, x_2). \tag{4.7}$$
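

The step from the product form of Equation 4.5 to the two separate equations is the usual separation of variables. The following sketch fills in that intermediate step, assuming that the potential energy functions do not depend on time and that we divide only where $\psi$ and $T$ are non-zero.

```latex
\begin{align*}
i\hbar\,\psi(x_1,x_2)\,\frac{dT}{dt} &= T(t)\,\hat{H}\psi(x_1,x_2)
  && \text{(substituting Equation 4.5 into Equation 4.4)}\\[4pt]
\frac{i\hbar}{T(t)}\frac{dT}{dt} &= \frac{\hat{H}\psi(x_1,x_2)}{\psi(x_1,x_2)} = E
  && \text{(dividing both sides by } \psi\,T\text{)}
\end{align*}
```

The left-hand side depends only on $t$ and the right-hand side only on $x_1$ and $x_2$, so both must equal the same constant, which we call $E$. Setting each side equal to $E$ gives Equations 4.6 and 4.7 respectively.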


Equation 4.7 is the time-independent Schrödinger equation for the two-particle 
system with total energy $E$. Its solutions give the energy eigenfunctions and 
eigenvalues of the two-particle system. Before looking at the solutions in more 
detail, we shall pause to consider the meaning of the wave function $\Psi(x_1, x_2, t)$. 


For distinguishable particles, a two-particle wave function is interpreted using the 
following extension of Born’s rule: 


Born’s rule for two distinguishable particles 


The probability of finding particle 1 in a small interval $\delta x_1$ centred on $x_1$ 
and particle 2 in a small interval $\delta x_2$ centred on $x_2$ is given by 

$$|\Psi(x_1, x_2, t)|^2\,\delta x_1\,\delta x_2. \tag{4.8}$$

This probability depends on the coordinates of both particles. For Equation 4.8 to 
make sense, we must ensure that $\Psi(x_1, x_2, t)$ is normalized so that the probability 
of finding both particles somewhere in all space is equal to 1: 


Normalizing the wave function of a two-particle system 

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}|\Psi(x_1, x_2, t)|^2\,dx_1\,dx_2 = 1. \tag{4.9}$$




Chapter 4 Many-particle systems and indistinguishability 


Essential skill

Evaluating the expectation value of an observable in a state described by a
two-particle wave function
In this chapter we shall almost always deal with stationary-state solutions. In this
case the probability is time-independent since $|\Psi(x_1, x_2, t)|^2 = |\psi(x_1, x_2)|^2$,
where $\psi(x_1, x_2)$ is the eigenfunction in Equation 4.7. Thus, for stationary states,
$\psi(x_1, x_2)$ can replace $\Psi(x_1, x_2, t)$ in Equation 4.9.


Sometimes, we are interested in knowing the probability of finding particle $i$
within a small interval $\delta x_i$ regardless of where the other particle is. In this case
we need to integrate over the coordinates of the other particle. For example, the
probability of finding particle 1 ($i = 1$) in a small interval $\delta x_1$, centred on $x_1$,
regardless of the whereabouts of particle 2, is

$$\left[\int_{-\infty}^{\infty} |\Psi(x_1, x_2, t)|^2\,dx_2\right]\delta x_1. \tag{4.10}$$


Based on this, the following example shows how to calculate the expectation 
value of the position of one of the particles. 


Worked Example 4.1 


Consider a one-dimensional two-particle system whose wave function at
$t = 0$ is

$$\psi(x_1, x_2) = \left(\frac{1}{\pi a_1 a_2}\right)^{1/2} e^{-x_1^2/2a_1^2}\, e^{-x_2^2/2a_2^2}.$$


(a) Verify that the wave function is normalized. 


(b) Find the expectation value of x2, the coordinate of the second particle. 
(The integrals you need are inside the back cover of the book.) 


Solution 


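(a) The wave function will be normalized if it satisfies Equation 4.9. We
therefore need to evaluate

$$\begin{aligned}
I &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \psi^*(x_1, x_2)\,\psi(x_1, x_2)\,dx_1\,dx_2 \\
  &= \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-x_1^2/a_1^2}\, e^{-x_2^2/a_2^2}\,dx_1\,dx_2 \\
  &= \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty} e^{-x_1^2/a_1^2}\,dx_1 \times \int_{-\infty}^{\infty} e^{-x_2^2/a_2^2}\,dx_2.
\end{aligned}$$

Using a standard integral from inside the back cover, we obtain

$$I = \frac{1}{\pi a_1 a_2} \times \sqrt{\pi}\,a_1 \times \sqrt{\pi}\,a_2 = 1.$$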


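(b) The expectation value of $x_2$ is given by the sandwich integral

$$\begin{aligned}
\langle x_2 \rangle &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \psi^*(x_1, x_2)\,x_2\,\psi(x_1, x_2)\,dx_1\,dx_2 \\
  &= \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-x_1^2/a_1^2}\,x_2\,e^{-x_2^2/a_2^2}\,dx_1\,dx_2 \\
  &= \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty} e^{-x_1^2/a_1^2}\,dx_1 \times \int_{-\infty}^{\infty} x_2\,e^{-x_2^2/a_2^2}\,dx_2 \\
  &= \frac{1}{\pi a_1 a_2} \times \sqrt{\pi}\,a_1 \times \int_{-\infty}^{\infty} x_2\,e^{-x_2^2/a_2^2}\,dx_2 = 0.
\end{aligned}$$

The last integral is zero because the integrand is an odd function of $x_2$,
integrated over a range that is centred on $x_2 = 0$.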


In general, the expectation value of any observable $A$ in a system of two particles
described by the wave function $\Psi(x_1, x_2, t)$ is given by the sandwich integral

$$\langle A \rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \Psi^*(x_1, x_2, t)\,\hat{A}\,\Psi(x_1, x_2, t)\,dx_1\,dx_2. \tag{4.11}$$
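The sandwich integral can also be checked numerically. The following is a minimal sketch, not part of the original text: it evaluates the integrals of Worked Example 4.1 on a grid with NumPy. The widths `a1` and `a2`, the grid size and the helper name `expectation` are all illustrative assumptions, and the helper only handles observables that act by multiplication (such as $x_2$ or $x_2^2$).

```python
import numpy as np

# Hypothetical width parameters (arbitrary units); any positive values would do.
a1, a2 = 1.0, 2.0

# Grids wide enough that the Gaussians are negligible at the edges.
x1 = np.linspace(-10 * a1, 10 * a1, 1001)
x2 = np.linspace(-10 * a2, 10 * a2, 1001)
dx1 = x1[1] - x1[0]
dx2 = x2[1] - x2[0]
X1, X2 = np.meshgrid(x1, x2, indexing="ij")

# Wave function of Worked Example 4.1 at t = 0.
psi = (1.0 / np.sqrt(np.pi * a1 * a2)) \
    * np.exp(-X1**2 / (2 * a1**2)) * np.exp(-X2**2 / (2 * a2**2))


def expectation(observable_values):
    """Riemann-sum approximation to the sandwich integral for an
    observable that acts on psi by simple multiplication."""
    return np.sum(np.conj(psi) * observable_values * psi).real * dx1 * dx2


norm = expectation(np.ones_like(psi))   # left-hand side of Equation 4.9
mean_x2 = expectation(X2)               # <x_2>
mean_x2_squared = expectation(X2**2)    # <x_2^2>

print(norm, mean_x2, mean_x2_squared)
```

Observables that involve derivatives, such as the momentum $p_1$, would need a finite-difference or analytic derivative of `psi` before the sum is taken; that step is deliberately left out of this sketch.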


Exercise 4.2 For the wave function in Worked Example 4.1:

(a) Find the expectation value of $p_1$, the momentum of the first particle.

(b) Find the expectation value of $(x_1 - x_2)^2$. What is the physical significance of
this expectation value?


We now consider the energy eigenfunction $\psi(x_1, x_2)$ in more detail. In order to
prepare the way for the case of identical particles, we shall specialize somewhat.
We now assume that both particles are in the same potential energy well, $V(x)$, so
that particle 1 has potential energy $V(x_1)$ and particle 2 has potential energy
$V(x_2)$; however, we continue to assume there is no interaction between the
particles (so there is no $V(x_1 - x_2)$ term). We assume that both particles have the
same mass $m_1 = m_2 = m$, but that they are distinguishable by some other means.
With these simplifications, Equation 4.7 becomes

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_1) + V(x_2)\right]\psi(x_1, x_2) = E\,\psi(x_1, x_2). \tag{4.12}$$


We shall look for solutions that can be written in the product form

$$\psi(x_1, x_2) = \psi_n(x_1)\,\psi_k(x_2), \tag{4.13}$$

where $\psi_n(x_1)$ and $\psi_k(x_2)$ are single-particle energy eigenfunctions describing the
states of particles 1 and 2, respectively.


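Introducing this form for $\psi(x_1, x_2)$ into Equation 4.12 gives

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_1) + V(x_2)\right]\psi_n(x_1)\,\psi_k(x_2) = E\,\psi_n(x_1)\,\psi_k(x_2).$$

Now, $\partial^2/\partial x_1^2$ and $V(x_1)$ do not do anything to $\psi_k(x_2)$, and similarly $\partial^2/\partial x_2^2$ and
$V(x_2)$ do not affect $\psi_n(x_1)$. We can therefore re-write the above equation as

$$\psi_k(x_2)\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} + V(x_1)\right]\psi_n(x_1) + \psi_n(x_1)\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_2)\right]\psi_k(x_2) = E\,\psi_n(x_1)\,\psi_k(x_2).$$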
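Dividing both sides of this equation by $\psi_n(x_1)\,\psi_k(x_2)$, we arrive at

$$\frac{1}{\psi_n(x_1)}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} + V(x_1)\right]\psi_n(x_1) + \frac{1}{\psi_k(x_2)}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_2)\right]\psi_k(x_2) = E.$$

In order for this equation to hold for all values of $x_1$ and $x_2$, we require that

$$\frac{1}{\psi_n(x_1)}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} + V(x_1)\right]\psi_n(x_1) = E_n$$

and

$$\frac{1}{\psi_k(x_2)}\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_2)\right]\psi_k(x_2) = E_k,$$

where $E_n$ and $E_k$ are constants satisfying $E_n + E_k = E$. These two equations
can be rewritten as individual eigenvalue equations for each particle:

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} + V(x_1)\right]\psi_n(x_1) = E_n\,\psi_n(x_1), \tag{4.14}$$

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_2)\right]\psi_k(x_2) = E_k\,\psi_k(x_2). \tag{4.15}$$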


We now see that the subscripts n and k signify the nth and kth energy 
eigenfunctions and eigenvalues for a particle in the potential energy well V(x). 
Because Equations 4.14 and 4.15 have the same form, we can say that the two 
particles have the same set of energies in the well. 


The time-dependent phase factor that satisfies Equation 4.6 is

$$T(t) = e^{-iEt/\hbar} = e^{-i(E_n + E_k)t/\hbar},$$

where $E$ is the total energy of the system. In an energy eigenstate of the
two-particle system, each particle has a definite energy and the total energy of the
system is the sum of the energies of the two particles; this is a consequence of
having no interaction term $V(x_1 - x_2)$ in the Hamiltonian operator. The ground
state (i.e. the state of lowest energy) is that in which both particles are in the 
single-particle state of lowest energy. The first excited state of the two-particle 
system is that in which one particle has this lowest energy, and the other has the 
next-to-lowest energy. 
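To make the additive energy rule concrete, here is a small sketch that lists the lowest two-particle energies. It is an illustration only: it assumes the well is an infinite square well, for which the single-particle energies are $E_n = n^2 E_1$ (the Book 1 result), and the function names are hypothetical.

```python
from itertools import product


def single_particle_energy(n):
    """Energy of level n in units of E_1, assuming E_n = n**2 * E_1
    (infinite square well; an assumption made for this illustration)."""
    return n**2


def two_particle_levels(n_max):
    """Sorted list of (total energy, n, k) with particle 1 in state n
    and particle 2 in state k, for non-interacting particles."""
    levels = [
        (single_particle_energy(n) + single_particle_energy(k), n, k)
        for n, k in product(range(1, n_max + 1), repeat=2)
    ]
    return sorted(levels)


levels = two_particle_levels(3)
for energy, n, k in levels[:4]:
    print(f"n = {n}, k = {k}: E = {energy} E_1")
```

For distinguishable particles the assignments $(n, k) = (1, 2)$ and $(2, 1)$ are different states with the same total energy, so the first excited level appears twice in the list.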


In this chapter we focus on stationary states, which have time-independent
probability densities. For this reason, we can calculate all probability
densities directly from the eigenfunctions. For the product eigenfunction
$\psi(x_1, x_2) = \psi_n(x_1)\,\psi_k(x_2)$, Born's rule tells us that the probability of finding
particle 1 in a small interval $\delta x_1$, centred on $x_1$, and particle 2 in a small interval
$\delta x_2$, centred on $x_2$, is

$$|\psi(x_1, x_2)|^2\,\delta x_1\,\delta x_2 = |\psi_n(x_1)|^2\,\delta x_1 \times |\psi_k(x_2)|^2\,\delta x_2. \tag{4.16}$$


This is reasonable: in a non-interacting system we would expect the probability 
densities for the two particles to be independent of one another, and independent 
probabilities are multiplied together. It is a general property of systems of 
non-interacting, distinguishable particles that the total probability density for the 
whole system is the product of independent probability densities for the individual 
particles. 


4.1.1 Two particles in an infinite square well


It is not easy to visualize $\psi(x_1, x_2)$ or the probability density $|\psi(x_1, x_2)|^2$, and the
simplest way to represent them is with contour plots. The case we consider
is that of two non-interacting distinguishable particles, 1 and 2, trapped in a
one-dimensional infinite square well, described by the potential energy function

$$V(x) = \begin{cases} 0 & \text{for } 0 \le x \le L, \\ \infty & \text{elsewhere,} \end{cases} \tag{4.17}$$

where we take $x = x_1$ for particle 1 and $x = x_2$ for particle 2. From Book 1
Chapter 3, we know the energy eigenfunctions for a single particle in such a
well. Let us suppose that the first particle is in the lowest state, $n = 1$, with
eigenfunction

$$\psi_1(x_1) = \sqrt{\frac{2}{L}}\,\sin\!\left(\frac{\pi x_1}{L}\right), \tag{4.18}$$

while the second particle is in the first excited state, $n = 2$, with eigenfunction

$$\psi_2(x_2) = \sqrt{\frac{2}{L}}\,\sin\!\left(\frac{2\pi x_2}{L}\right). \tag{4.19}$$


Systems of two distinguishable particles 


Figure 4.1 Contour plots of the probability density as a function of the
coordinates $x_1$ and $x_2$ of two distinguishable particles trapped in a
one-dimensional infinite square well of length $L$: (a) particle 1 in the ground
state and particle 2 in the first excited state; (b) the two states reversed. In
both panels, the bottom left-hand corner corresponds to $x_1 = x_2 = 0$.


Now refer to Figure 4.1a. Below the main figure we show $\psi_1(x_1)$, and on the
left-hand side we show $\psi_2(x_2)$, which has a node at $x_2 = L/2$. The main figure is
a contour plot of the probability density $|\psi_1(x_1)\,\psi_2(x_2)|^2$. This probability
density is zero along the $x_2 = L/2$ line, where $\psi_2(x_2)$ has a node, and it has
ridges along $x_2 = L/4$ and $x_2 = 3L/4$. The ridges are highest at $x_1 = L/2$.

Figure 4.1b shows a similar contour plot of the probability density
$|\psi_2(x_1)\,\psi_1(x_2)|^2$, which describes the case where particle 1 is in the first excited
state, $n = 2$, and particle 2 is in the lowest state, $n = 1$.
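The features described for Figure 4.1 can be reproduced on a grid. The sketch below is illustrative rather than the method used to draw the figure: it assumes the standard infinite-well eigenfunctions $\sqrt{2/L}\,\sin(n\pi x/L)$, sets $L = 1$ in arbitrary units, and uses hypothetical names such as `psi_n` and `density_a`.

```python
import numpy as np

L = 1.0  # well length in arbitrary units (assumed value)


def psi_n(n, x):
    """Infinite-square-well eigenfunction sqrt(2/L) sin(n pi x / L), 0 <= x <= L."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)


# 401 points, so that L/4, L/2 and 3L/4 all fall on grid points.
x = np.linspace(0.0, L, 401)
dx = x[1] - x[0]
X1, X2 = np.meshgrid(x, x, indexing="ij")

# Case (a): particle 1 in n = 1, particle 2 in n = 2.
density_a = np.abs(psi_n(1, X1) * psi_n(2, X2)) ** 2
# Case (b): the two states reversed.
density_b = np.abs(psi_n(2, X1) * psi_n(1, X2)) ** 2

# Grid points where case (a) reaches its largest value.
peak = density_a.max()
i_peaks, j_peaks = np.where(np.isclose(density_a, peak, rtol=1e-9, atol=0.0))
peak_positions = sorted((float(x[i]), float(x[j])) for i, j in zip(i_peaks, j_peaks))

# Total probability as a Riemann sum (the density vanishes on the boundary).
total_probability = density_a.sum() * dx * dx

print(peak_positions, total_probability)
```

Passing `density_a` or `density_b` to a contour-plotting routine would give pictures of the kind shown in the two panels; on this grid, `density_b` is simply the transpose of `density_a`.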


The two parts of Figure 4.1 differ substantially from one another because of our 
assumption that the particles are distinguishable; it matters which particle is in 
which state. You will see that such distinctions disappear later in this chapter 
when we consider identical particles — in which case the probability densities are 
very different. 


Exercise 4.3 (a) For the case shown in Figure 4.1a, write down an explicit
expression for the probability density of finding particle 1 in a small interval
around $x_1$ and particle 2 in a small interval around $x_2$.

(b) Confirm that the probability density in part (a) reaches its maximum value at
$(x_1, x_2) = (L/2, L/4)$ and at $(x_1, x_2) = (L/2, 3L/4)$.


4.1.2 Including spin in the wave function 


So far, we have neglected the spin of the particles. However, many particles 
(including electrons, protons and neutrons) have spin. Here we consider how to 
take spin into account for two distinguishable spin-$\frac{1}{2}$ particles.

First, we shall consider the description of spin for a single spin-$\frac{1}{2}$ particle. In the
previous chapter, we specified spin states with respect to a variety of different
reference directions (as in $|\uparrow_x\rangle$, $|\downarrow_y\rangle$ or $|\uparrow_{\mathbf{n}}\rangle$). Here, however, we can adopt a
simpler convention: all spin states will be specified with respect to the z-direction.
There is no loss of generality because we can choose the z-axis to be in any
direction we like. Bearing this convention in mind, we shall omit unnecessary
subscripts by writing

spin-up $= |\uparrow\rangle = |\uparrow_z\rangle$ and spin-down $= |\downarrow\rangle = |\downarrow_z\rangle$.


There is an alternative way of specifying the spin state of a spin-$\frac{1}{2}$ particle, based
on the quantum numbers $s$ and $m_s$. By definition, the spin quantum number of
a spin-$\frac{1}{2}$ particle is $s = \frac{1}{2}$, which means that, in any state, the square of the
magnitude of the spin angular momentum is $S^2 = s(s + 1)\hbar^2 = 3\hbar^2/4$. The spin
magnetic quantum number, $m_s$, is equal to $\pm\frac{1}{2}$. This defines the z-component of
the spin angular momentum via the equation $S_z = m_s\hbar = \pm\hbar/2$. Thus we can
specify a spin state by giving $s$ and $m_s$. We have

$|s = \frac{1}{2},\, m_s = +\frac{1}{2}\rangle = |\uparrow\rangle$ and $|s = \frac{1}{2},\, m_s = -\frac{1}{2}\rangle = |\downarrow\rangle$.

We can abbreviate this notation by omitting $s = \frac{1}{2}$, since we know that we are
dealing with spin-$\frac{1}{2}$ particles. We also omit the '$m_s =$', so the spin states are
labelled unambiguously by

$|{+\frac{1}{2}}\rangle = |\uparrow\rangle = |\uparrow_z\rangle$ and $|{-\frac{1}{2}}\rangle = |\downarrow\rangle = |\downarrow_z\rangle$.


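Now let's consider the description of spin for two distinguishable particles. To
each spin ket, and to all spin observables and quantum numbers, we attach the
subscript $i$, where $i$ can be equal to 1 or 2 depending on the particle under
consideration. We then have the eigenvalue equations

$$\hat{S}_i^2\,|\uparrow\rangle_i = \tfrac{3}{4}\hbar^2\,|\uparrow\rangle_i, \tag{4.20}$$
$$\hat{S}_i^2\,|\downarrow\rangle_i = \tfrac{3}{4}\hbar^2\,|\downarrow\rangle_i, \tag{4.21}$$
$$\hat{S}_{zi}\,|\uparrow\rangle_i = +\tfrac{1}{2}\hbar\,|\uparrow\rangle_i, \tag{4.22}$$
$$\hat{S}_{zi}\,|\downarrow\rangle_i = -\tfrac{1}{2}\hbar\,|\downarrow\rangle_i. \tag{4.23}$$

In our alternative notation, the last two equations can be combined to give

$$\hat{S}_{zi}\,|{\pm\tfrac{1}{2}}\rangle_i = \pm\tfrac{1}{2}\hbar\,|{\pm\tfrac{1}{2}}\rangle_i. \tag{4.24}$$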
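These eigenvalue equations can be checked with the usual 2 × 2 matrix representation of the spin operators in the z-basis. The sketch below assumes that representation ($\hbar/2$ times the Pauli matrices) and works in units where $\hbar = 1$; the variable names are illustrative and the particle label $i$ is dropped because a single particle is considered.

```python
import numpy as np

hbar = 1.0  # units in which hbar = 1 (an assumption made for convenience)

# Basis kets in the z-representation: |up> = (1, 0), |down> = (0, 1).
up = np.array([1.0, 0.0], dtype=complex)
down = np.array([0.0, 1.0], dtype=complex)

# Spin-1/2 operators as (hbar/2) times the Pauli matrices.
Sx = (hbar / 2) * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = (hbar / 2) * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

# Square of the spin magnitude: S^2 = Sx^2 + Sy^2 + Sz^2.
S_squared = Sx @ Sx + Sy @ Sy + Sz @ Sz

for name, ket in (("up", up), ("down", down)):
    sz_value = np.vdot(ket, Sz @ ket).real
    s2_value = np.vdot(ket, S_squared @ ket).real
    print(f"{name}: S_z = {sz_value:+.2f} hbar, S^2 = {s2_value:.2f} hbar^2")
```

In this representation `S_squared` is a multiple of the identity matrix, which is why every spin state of a spin-$\frac{1}{2}$ particle has the same value of $S^2$.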


Just as the spatial eigenfunctions $\psi(x_1, x_2) = \psi_n(x_1)\,\psi_k(x_2)$ of a two-particle
system are products of single-particle functions, so the spin state of the system
can be expressed in terms of 'products' of spin kets. For example, if particle 1
has spin up, and particle 2 has spin down, we can represent the spin of the
two-particle system as

$$|\uparrow\rangle_1\,|\downarrow\rangle_2 = |{+\tfrac{1}{2}}\rangle_1\,|{-\tfrac{1}{2}}\rangle_2. \tag{4.25}$$


The meaning of this product will be discussed shortly; for the moment, you can 
simply regard it as a notation that indicates the spins of the particles, with the 
subscripts on the kets making it clear which particle has which spin. It is also 
possible to indicate the spins of the particles using a single ket, for example, as 


$$|{+\tfrac{1}{2}}, {-\tfrac{1}{2}}\rangle = |\uparrow\downarrow\rangle, \tag{4.26}$$

or more generally as $|m_{s_1}, m_{s_2}\rangle$, but here a very important convention applies: the
first entry in the ket always refers to particle 1 and the second entry to particle 2.
We call this the positional convention for spin states. It means that $|\uparrow\downarrow\rangle$ does not
represent the same spin state as $|\downarrow\uparrow\rangle$.


Exercise 4.4 Express the following spin states in $|m_{s_1}, m_{s_2}\rangle$ form:


(a) | 1). 1 To) | Lr | Lo a 


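The meaning of an expression like $|\uparrow\rangle_1|\downarrow\rangle_2$ becomes clearer when we consider
the effect that a spin operator has on it. The rule is that any spin operator for
particle 1 acts only on the spin ket of particle 1, and any spin operator for
particle 2 acts only on the spin ket of particle 2. We therefore have

$$\hat{S}_{z1}\,|\uparrow\rangle_1|\downarrow\rangle_2 = \bigl(\hat{S}_{z1}|\uparrow\rangle_1\bigr)\,|\downarrow\rangle_2 = +\tfrac{1}{2}\hbar\,|\uparrow\rangle_1|\downarrow\rangle_2,$$
$$\hat{S}_{z2}\,|\uparrow\rangle_1|\downarrow\rangle_2 = |\uparrow\rangle_1\,\bigl(\hat{S}_{z2}|\downarrow\rangle_2\bigr) = -\tfrac{1}{2}\hbar\,|\uparrow\rangle_1|\downarrow\rangle_2,$$

and we can add these two equations to give

$$(\hat{S}_{z1} + \hat{S}_{z2})\,|\uparrow\rangle_1|\downarrow\rangle_2 = 0. \tag{4.27}$$

The operator on the left-hand side represents the total z-component of spin for the
system. This acts on $|\uparrow\rangle_1|\downarrow\rangle_2$ to give zero, as you might expect for a two-particle
system in which one particle has spin up and the other has spin down.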
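The rule that each spin operator acts only on its own particle's ket has a direct matrix counterpart. As a minimal sketch, assuming the standard matrix representation of $\hat{S}_z$ and units with $\hbar = 1$, the product of kets can be modelled with a Kronecker product, with the first factor always referring to particle 1 (the positional convention). The names `Sz1`, `Sz2` and `up_down` are illustrative.

```python
import numpy as np

hbar = 1.0  # units in which hbar = 1 (assumption)

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
Sz = (hbar / 2) * np.array([[1.0, 0.0], [0.0, -1.0]])
identity = np.eye(2)

# Positional convention: the first Kronecker factor refers to particle 1.
Sz1 = np.kron(Sz, identity)   # acts on the spin ket of particle 1 only
Sz2 = np.kron(identity, Sz)   # acts on the spin ket of particle 2 only

up_down = np.kron(up, down)   # |up>_1 |down>_2

result_1 = Sz1 @ up_down               # +(hbar/2) times the original ket
result_2 = Sz2 @ up_down               # -(hbar/2) times the original ket
result_total = (Sz1 + Sz2) @ up_down   # the zero vector, as in Equation 4.27

print(result_1, result_2, result_total)
```

Writing `np.kron(Sz, identity)` rather than applying `Sz` to a single factor is a design choice that keeps both operators on the same four-dimensional space, so they can be added as in Equation 4.27.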


The final step is to combine information about the coordinates and spin of the
particles. This is done by introducing a quantity called the total wave function of
the system, $\Psi_{1,2}$. The total wave function is written as a product of the spatial
wave function $\Psi(x_1, x_2, t)$ and the spin ket $|m_{s_1}, m_{s_2}\rangle$ that describes the spin
state of the system. Thus

$$\Psi_{1,2}(t) = \Psi(x_1, x_2, t)\,|m_{s_1}, m_{s_2}\rangle. \tag{4.28}$$

This chapter deals only with stationary states, and is not concerned with
time-development. We shall therefore simplify matters by setting $t = 0$, at which
time the stationary-state wave function $\Psi(x_1, x_2, t)$ is equal to the corresponding
energy eigenfunction $\psi(x_1, x_2)$. We therefore consider

$$\psi_{1,2} = \psi(x_1, x_2)\,|m_{s_1}, m_{s_2}\rangle, \tag{4.29}$$

and will also call this the total wave function.

The argument leading up to Equation 4.15 showed that $\psi_n(x_1)\,\psi_k(x_2)$ is
an eigenfunction of $\hat{H}$ with eigenvalue $(E_n + E_k)$.

Essential skill

Applying operators to total wave functions


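Total wave functions can be acted on by any type of operator. Spin operators act
only on the spin part, while operators involving coordinates or momenta act only
on the spatial part. For example, consider a state described by the total wave
function

$$\psi_{1,2} = \psi_n(x_1)\,\psi_k(x_2)\,|\uparrow\rangle_1|\downarrow\rangle_2 \tag{4.30}$$

in a system where the Hamiltonian operator is that in Equation 4.12:

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_1) + V(x_2).$$

This operator does not contain any spin matrices, so it acts only on the spatial part
of the wave function. We obtain

$$\begin{aligned}
\hat{H}\,\psi_{1,2} &= \hat{H}\,\psi_n(x_1)\,\psi_k(x_2)\,|\uparrow\rangle_1|\downarrow\rangle_2 \\
  &= \bigl(\hat{H}\,\psi_n(x_1)\,\psi_k(x_2)\bigr)\,|\uparrow\rangle_1|\downarrow\rangle_2 \\
  &= \bigl((E_n + E_k)\,\psi_n(x_1)\,\psi_k(x_2)\bigr)\,|\uparrow\rangle_1|\downarrow\rangle_2 \\
  &= (E_n + E_k)\,\psi_{1,2}.
\end{aligned}$$

So the total wave function $\psi_{1,2}$ is an eigenfunction of $\hat{H}$, with eigenvalue
$E_n + E_k$; this is interpreted as the total energy of the system.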


Worked Example 4.2 
A system of two distinguishable particles has the total wave function

$$\psi_{1,2} = \psi_n(x_1)\,\psi_k(x_2)\,|\uparrow\rangle_1|\uparrow\rangle_2,$$

where $\psi_n(x_1)$ and $\psi_k(x_2)$ are the eigenfunctions in Equations 4.14 and 4.15.
Show that this total wave function is an eigenfunction of $\hat{S}_{z1}$, and find the
corresponding eigenvalue.

Solution 


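Applying $\hat{S}_{z1}$ to the total wave function, we note that $\psi_n(x_1)$ and $\psi_k(x_2)$ do
not depend on spin, so the spin operator does not ‘see’ them. Hence

$$\hat{S}_{z1}\,\psi_{1,2} = \psi_n(x_1)\,\psi_k(x_2)\,\hat{S}_{z1}\,|\uparrow\rangle_1|\uparrow\rangle_2.$$

In addition, $\hat{S}_{z1}$ does not act on the spin ket of particle 2, so

$$\begin{aligned}
\hat{S}_{z1}\,\psi_{1,2} &= \psi_n(x_1)\,\psi_k(x_2)\,\bigl(\hat{S}_{z1}|\uparrow\rangle_1\bigr)\,|\uparrow\rangle_2 \\
  &= \psi_n(x_1)\,\psi_k(x_2)\,\bigl(+\tfrac{1}{2}\hbar\,|\uparrow\rangle_1\bigr)\,|\uparrow\rangle_2 \\
  &= +\tfrac{1}{2}\hbar\,\psi_{1,2}.
\end{aligned}$$

Hence $\psi_{1,2}$ is an eigenfunction of the spin operator $\hat{S}_{z1}$, with
eigenvalue $+\frac{1}{2}\hbar$.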


In general, the total wave function gives a complete description of the 
system, including spin. 


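Exercise 4.5 (a) Is $|\uparrow\uparrow\rangle$ an eigenvector of the operator $(\hat{S}_{z1} + \hat{S}_{z2})$?
(b) Is $|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle$ an eigenvector of $(\hat{S}_{z1} + \hat{S}_{z2})$?
(c) Is $|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle$ an eigenvector of $(\hat{S}_{z1} + \hat{S}_{z2})$?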


Figure 4.2 (a) and (b) A collision between two seemingly identical particles in
classical physics. The trajectories can be followed at all times: two different
cases are shown, in which different particles enter detector X. (c) A collision
between two identical particles in quantum physics. The trajectories cannot be
followed and it is impossible to say which particle enters detector X.


4.2 Identical particles 


We now encounter a profound difference between the world of everyday things 
and the quantum world: the existence of identical particles. However similar two 
ball bearings may appear, they are not identical. It is effectively impossible for 
two individual balls, each made of a vast number of atoms, to be identical. By 
contrast, all electrons are absolutely identical, as are all protons and all alpha 
particles. Two particles are said to be identical if all their unalterable attributes 
(such as charge and mass) are the same. The difference between ‘extremely 
similar’ (as snooker balls may be) and ‘identical’, as electrons are, is profound, 
and is a crucial ingredient in determining the nature of the world. 


Let us explore the vital difference between similar particles in classical physics
and identical particles in quantum physics. Figure 4.2 shows the collision between
two seemingly identical balls in classical physics. In Figure 4.2a, the ball
originally in the top left-hand corner reaches detector X. In Figure 4.2b, the ball
originally in the bottom right-hand corner reaches detector X. We have no
doubt about which of these outcomes has occurred because we can follow
the trajectories of the balls. We could also predict, on the basis of Newtonian
mechanics, which ball will end up in which detector.

A very different situation applies in quantum physics. In quantum physics,
particles cannot be said to follow well-defined trajectories because the
uncertainty principle prevents us from knowing both the position and the
momentum of a particle at any given time. If two identical particles originate in
well-separated regions, they may initially be distinguished from one another by
virtue of their locations, but as soon as their wave functions overlap, we lose
track of which particle is which, and the particles become indistinguishable.
This idea is represented in Figure 4.2c where, instead of thin lines representing
well-defined trajectories, we use fuzzy regions because the positions of the
particles are not well-defined. In quantum physics we cannot say which of the
particles enters detector X because there is no way to distinguish between the
particles once they have arrived, and the trajectories of the particles are
ill-defined. All we can say is that a particle has arrived at this detector. There
are two ways in which this might happen, since either of the particles could have
triggered the detector. When a given event can happen in two different ways, the
possibility of interference arises; this is what happens when a particle can reach
a screen via two different slits, for example. A similar thing happens here. When
beams of identical particles scatter from one another, the intensity of scattered
particles displays peaks and troughs when plotted as a function of the scattering
angle (Figure 4.3). Such an interference pattern is not found for distinguishable
particles, so the identity of particles makes a profound difference; it leads to
completely new phenomena!

Figure 4.3 The scattering probability of two particles as a function of their
scattering angle, $\theta$. The solid line is for two identical alpha particles, and
shows the peaks and troughs characteristic of an interference pattern; the
dashed line shows what would be observed for two similar but distinguishable
particles. (Axes: log(scattering probability) against $\theta$/degrees.)
We cannot give all the details at this stage, but it may be useful
to survey the main ideas, so that you can see the route that lies 
ahead. When we describe a system of two identical particles in 
quantum mechanics, we generally start by labelling the particles: 
1 and 2. This may seem contradictory, but it is necessary for 
book-keeping purposes, as it provides a consistent way of 
indicating ‘this particle’ and ‘the other particle’. However, the whole point about 
having a system of identical particles in quantum mechanics is that we are unable 
to distinguish between the particles in any way. Therefore alternative ways of 
labelling the particles (with the labels swapped over) must describe exactly the 
same state. To pretend otherwise would be to pretend that we know more about 
the system than is knowable, which is against the rules in quantum mechanics — 
comparable to pretending that we know which slit an electron passes through in a 
two-slit interference experiment. 


A full description of the state of a two-particle system in quantum mechanics is
provided by the total wave function, $\psi_{1,2}$. If we label the particles the other way
around, this becomes $\psi_{2,1}$, which must describe precisely the same state as $\psi_{1,2}$.
Remembering that a wave function can be multiplied by an arbitrary phase factor
$e^{i\alpha}$ without changing the state being described, we can say that $\psi_{2,1}$ must be equal
to $\psi_{1,2}$, possibly multiplied by a phase factor.


● Does the total wave function of Equation 4.30 provide an adequate
description of a system containing two identical particles?

○ No, this total wave function will not do. We have

$$\psi_{1,2} = \psi_n(x_1)\,\psi_k(x_2)\,|\uparrow\rangle_1|\downarrow\rangle_2,$$

so reversing the labelling gives

$$\psi_{2,1} = \psi_n(x_2)\,\psi_k(x_1)\,|\uparrow\rangle_2|\downarrow\rangle_1.$$

Clearly $\psi_{2,1}$ is not equal to a phase factor times $\psi_{1,2}$, so neither $\psi_{1,2}$ nor $\psi_{2,1}$
gives a correct description of a system containing two identical particles.


It turns out that Nature allows only two possibilities. The first possibility is that
swapping the particle labels leaves the total wave function unchanged; in this case
$\psi_{2,1} = \psi_{1,2}$, and the total wave function is said to be symmetric. The alternative is
that swapping the particle labels reverses the sign of the total wave function; in
this case $\psi_{2,1} = -\psi_{1,2}$, and the total wave function is said to be antisymmetric.


How do we know which of these possibilities applies in any given case? It turns 
out that all the particles in Nature fall into two categories, called bosons and 
fermions. For example, alpha particles are bosons, and electrons are fermions. 
The basic rule can then be stated as follows: systems of identical bosons have 
symmetric total wave functions, while systems of identical fermions have 
antisymmetric total wave functions. 


We must now fill in the details of constructing symmetric and antisymmetric total 
wave functions. The total wave function is a product of a spatial part and a
spin part, and we consider each of these parts in turn. The next two subsections 
will do this, before putting everything back together to arrive at satisfactory total 
wave functions, suitable for systems of identical bosons or fermions. 


4.2.1 The spatial wave function for identical particles 


In this section we examine the spatial wave function of a pair of identical particles
of mass $m_1 = m_2 = m$. As before, we assume that both particles are in the same
potential energy well, $V(x)$, and that they do not interact with one another
($V(x_1 - x_2) = 0$). The Hamiltonian operator is then

$$\hat{H} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m}\frac{\partial^2}{\partial x_2^2} + V(x_1) + V(x_2). \tag{4.31}$$

Following the discussion in Section 4.1, we will assume that the spatial energy
eigenfunctions of the Hamiltonian operator have the form

$$\psi(x_1, x_2) = \psi_n(x_1)\,\psi_k(x_2), \tag{4.32}$$

where $\psi_n(x)$ and $\psi_k(x)$ are the $n$th and $k$th energy eigenfunctions of the
single-particle Hamiltonian operator

$$\hat{H}_{\text{single}} = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2} + V(x). \tag{4.33}$$
Any function $f(x_1, x_2)$ of two variables is said to be symmetric if
$f(x_1, x_2) = +f(x_2, x_1)$ and is said to be antisymmetric if
$f(x_1, x_2) = -f(x_2, x_1)$.

Note that the principle of superposition always applies to the full Schrödinger
equation. It applies to the time-independent Schrödinger equation only for
eigenfunctions with the same eigenvalue.
The product function in Equation 4.32 describes a state in which the particle
labelled 1 is in state $n$ and the particle labelled 2 is in state $k$. We have labelled
the particles, 1 and 2, since this is needed to write down explicit expressions for $\hat{H}$
and $\psi(x_1, x_2)$, but we do not expect any measurable property to depend on these
labels. We therefore insist that the eigenfunction with the labels reversed,

$$\psi(x_2, x_1) = \psi_n(x_2)\,\psi_k(x_1), \tag{4.34}$$

describes exactly the same state of the system. Both functions are equally valid
eigenfunctions of the Hamiltonian operator in Equation 4.31, and both have the
same energy eigenvalue, $E = E_n + E_k$, the total energy of the two-particle
system. We therefore have no reason to prefer Equation 4.32 over Equation 4.34.


If the particles are identical, the probability density (linked to what we can
measure) must have the same values for all $x_1$ and $x_2$, regardless of whether
particle 1 is in state $n$ and particle 2 in state $k$ or the other way round. Therefore
the indistinguishability of the particles imposes the following condition on the
spatial functions:

$$|\psi(x_1, x_2)|^2 = |\psi(x_2, x_1)|^2, \tag{4.35}$$

that is, the probability density does not change when the particle labels are
exchanged. This requirement is fulfilled if

$$\psi(x_1, x_2) = \pm\psi(x_2, x_1). \tag{4.36}$$

The plus sign corresponds to a symmetric eigenfunction, and the minus sign
corresponds to an antisymmetric eigenfunction. However, a simple product of
one-particle eigenfunctions is neither symmetric nor antisymmetric since, in
general,

$$\psi_n(x_1)\,\psi_k(x_2) \neq \pm\psi_n(x_2)\,\psi_k(x_1).$$


To obtain satisfactory eigenfunctions, suitable for describing a system of two
identical particles, we shall make use of the facts that $\psi(x_1, x_2)$ and $\psi(x_2, x_1)$
have the same energy and that $\hat{H}$ is a linear operator. In general, if $\psi_A$ and $\psi_B$ are
two eigenfunctions of $\hat{H}$ with the same energy eigenvalue $E$ (so that $\hat{H}\psi_A = E\psi_A$
and $\hat{H}\psi_B = E\psi_B$), then the fact that $\hat{H}$ is a linear operator guarantees that

$$\hat{H}(a\psi_A + b\psi_B) = a\hat{H}\psi_A + b\hat{H}\psi_B = aE\psi_A + bE\psi_B = E(a\psi_A + b\psi_B),$$

for any constants $a$ and $b$. In other words: any linear superposition of two
solutions of the time-independent Schrödinger equation with the same energy
eigenvalue $E$ is also a solution with eigenvalue $E$. This is the principle of
superposition for the time-independent Schrödinger equation. Applying this
result to $\psi(x_1, x_2)$ and $\psi(x_2, x_1)$, which do have the same energy eigenvalue, we
can write two alternative linear combinations of products,

$$\psi^{\pm}(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_1)\,\psi_k(x_2) \pm \psi_k(x_1)\,\psi_n(x_2)\bigr], \tag{4.37}$$

which both satisfy the condition stated in Equation 4.35. The $\pm$ signs
on the left of Equation 4.37 match the $\pm$ signs on the right: $\psi^+$ obeys
$\psi^+(x_1, x_2) = +\psi^+(x_2, x_1)$, while $\psi^-$ obeys $\psi^-(x_1, x_2) = -\psi^-(x_2, x_1)$. The
constant $1/\sqrt{2}$ normalizes the eigenfunction, provided that $\psi_n$ and $\psi_k$ are
themselves normalized and orthogonal to one another, as you can verify in
Exercise 4.6 below.


Worked Example 4.3

Essential skill

Verifying that a spatial function is symmetric or antisymmetric with respect to
particle exchange

Prove that the eigenfunction $\psi^-$ in Equation 4.37 satisfies the condition
stated in Equation 4.35.

Solution

To prove this, we only need to prove Equation 4.36. We know that

$$\psi^-(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\bigr],$$

so what we need to do is to write out $\psi^-(x_2, x_1)$ explicitly.
Exchanging the particle labels:

$$\psi^-(x_2, x_1) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_2)\,\psi_k(x_1) - \psi_k(x_2)\,\psi_n(x_1)\bigr].$$

A simple rearrangement then gives

$$\begin{aligned}
\psi^-(x_2, x_1) &= -\frac{1}{\sqrt{2}}\bigl[-\psi_k(x_1)\,\psi_n(x_2) + \psi_n(x_1)\,\psi_k(x_2)\bigr] \\
  &= -\psi^-(x_1, x_2),
\end{aligned}$$

so the condition $|\psi(x_1, x_2)|^2 = |\psi(x_2, x_1)|^2$ is satisfied.


The above worked example reveals that $\psi^-(x_1, x_2)$ changes sign when the
particle labels are exchanged. A similar argument shows that $\psi^+(x_1, x_2)$ remains
unchanged when the particle labels are exchanged. Both $\psi^+$ and $\psi^-$ are
satisfactory energy eigenfunctions for a system of two identical particles; the
former is symmetric and the latter antisymmetric.
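The exchange properties of the two combinations in Equation 4.37 can also be confirmed numerically. The sketch below is an illustration under stated assumptions: it uses infinite-square-well eigenfunctions with $n = 1$ and $k = 2$ as the single-particle states, sets $L = 1$, and the function names `psi_single` and `psi_pm` are hypothetical. On a square grid, swapping $x_1$ and $x_2$ corresponds to transposing the array of values.

```python
import numpy as np

L = 1.0  # well length, arbitrary units (assumed)


def psi_single(n, x):
    """Single-particle infinite-square-well eigenfunction (assumed form)."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)


def psi_pm(n, k, x1, x2, sign):
    """Symmetric (sign = +1) or antisymmetric (sign = -1) combination of Equation 4.37."""
    return (psi_single(n, x1) * psi_single(k, x2)
            + sign * psi_single(k, x1) * psi_single(n, x2)) / np.sqrt(2.0)


x = np.linspace(0.0, L, 401)
dx = x[1] - x[0]
X1, X2 = np.meshgrid(x, x, indexing="ij")

psi_plus = psi_pm(1, 2, X1, X2, +1)
psi_minus = psi_pm(1, 2, X1, X2, -1)

# Exchanging the labels x1 <-> x2 is a transpose of the grid of values.
is_symmetric = np.allclose(psi_plus.T, psi_plus)
is_antisymmetric = np.allclose(psi_minus.T, -psi_minus)

# Normalization of both combinations (Riemann sum over the well).
norm_plus = np.sum(psi_plus**2) * dx * dx
norm_minus = np.sum(psi_minus**2) * dx * dx

# The antisymmetric function vanishes wherever x1 = x2.
vanishes_on_diagonal = np.allclose(np.diag(psi_minus), 0.0)

print(is_symmetric, is_antisymmetric, norm_plus, norm_minus, vanishes_on_diagonal)
```

Calling `psi_pm(n, n, X1, X2, -1)` with both particles in the same state gives an array of zeros, since the two products being subtracted are identical.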


Exercise 4.6 (a) Verify that $\psi^+(x_1, x_2)$ given by Equation 4.37 is a
symmetric function.

(b) Show that if $\psi_n(x)$ and $\psi_k(x)$ are two different normalized orthogonal
single-particle eigenfunctions, then the two-particle eigenfunction $\psi^+(x_1, x_2)$ is
itself normalized, so that the total probability of there being two particles in the
whole of space is unity.

Exercise 4.7 Show that the antisymmetric two-particle eigenfunction given by
Equation 4.37 is everywhere equal to zero if $\psi_n(x) = \psi_k(x)$.


Exercise 4.7 tells us something important: that we cannot find an antisymmetric
eigenfunction corresponding to both particles being in the same spatial state.
This is something that has profound consequences, as you will see later. By
contrast, we can construct the symmetric eigenfunctions $\psi_n(x_1)\,\psi_n(x_2)$ and
$\psi_k(x_1)\,\psi_k(x_2)$. These two product eigenfunctions, together with the symmetric
and antisymmetric combinations $\frac{1}{\sqrt{2}}[\psi_n(x_1)\,\psi_k(x_2) \pm \psi_k(x_1)\,\psi_n(x_2)]$, are the
only possible symmetric or antisymmetric eigenfunctions that can be constructed
from the single-particle eigenfunctions $\psi_n$ and $\psi_k$.


The following exercise concerns two identical particles in a harmonic oscillator
potential energy well. As in our previous discussion, there is no interaction
between the particles, so the total Hamiltonian operator $\hat{H}$ is the sum
$\hat{H}_1 + \hat{H}_2$ of the Hamiltonian operators for particles 1 and 2. This would not be the case
if the particles interacted with each other, in which case there would be an
additional potential energy term $V(x_1 - x_2)$.


Exercise 4.8 (a) Find the symmetric and antisymmetric two-particle spatial
energy eigenfunctions $\psi^{\pm}(x_1, x_2)$ for two identical non-interacting particles of
mass $m$, one in the ground state and the other in the first excited state of a
one-dimensional harmonic oscillator. The first two single-particle harmonic
oscillator eigenfunctions and eigenvalues are

$$\psi_0(x) = \left(\frac{1}{\sqrt{\pi}\,a}\right)^{1/2} e^{-x^2/2a^2}, \qquad E_0 = \tfrac{1}{2}\hbar\omega_0,$$

$$\psi_1(x) = \left(\frac{1}{2\sqrt{\pi}\,a}\right)^{1/2} \frac{2x}{a}\, e^{-x^2/2a^2}, \qquad E_1 = \tfrac{3}{2}\hbar\omega_0,$$

where $a = \sqrt{\hbar/m\omega_0}$ is the length parameter of the oscillator, and $\omega_0$ is the
classical angular frequency.

(b) Confirm that the functions $\psi^{\pm}(x_1, x_2)$ found in part (a) are eigenfunctions of
the Hamiltonian operator describing the two-particle system, and find their
eigenvalues.


4.2.2 Spin states of a system of two electrons 


We now consider the spin state of a pair of identical particles. Different 
descriptions apply depending on the types of particle involved, but we shall deal 
only with spin-$\frac{1}{2}$ particles, such as electrons. This is the most important case for
most applications, given the central role that electrons play in atoms, molecules, 
metals and so on. 


We know that the total wave function must be symmetric or antisymmetric, which 
implies that the space and spin parts must also be symmetric or antisymmetric. 
We have seen how to construct symmetric and antisymmetric spatial functions, 
$\psi^{\pm}(x_1, x_2)$. We shall now construct symmetric and antisymmetric combinations
of spin kets. 


Our starting point is the set of four possible product spin kets: 


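$|\uparrow\uparrow\rangle = |\uparrow\rangle_1|\uparrow\rangle_2$, $\qquad |\downarrow\downarrow\rangle = |\downarrow\rangle_1|\downarrow\rangle_2$,
$|\uparrow\downarrow\rangle = |\uparrow\rangle_1|\downarrow\rangle_2$, $\qquad |\downarrow\uparrow\rangle = |\downarrow\rangle_1|\uparrow\rangle_2$.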


The two entries in the first row remain unchanged when the labels 1 and 2 are 
swapped, so they are symmetric. The entries in the second row are changed 
by swapping the labels, and they are neither symmetric nor antisymmetric. 
However, adding and subtracting them produces symmetric and antisymmetric 
combinations. The complete set of symmetric and antisymmetric two-particle 
spin kets is: 


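symmetric: $$|\uparrow\uparrow\rangle, \qquad \frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right), \qquad |\downarrow\downarrow\rangle,$$ (4.38)

antisymmetric: $$\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right),$$ (4.39)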


where $1/\sqrt{2}$ is a normalization factor. Following the method of Worked
Example 4.3, you can easily check that the ‘+’ combination is symmetric with
respect to particle exchange, and the ‘−’ combination is antisymmetric.
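The exchange symmetry of these kets can also be checked numerically. The following is a minimal sketch, assuming the basis ordering $|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle$ for the four-component vectors; the names `ket`, `SWAP` and `exchange_sign` are illustrative and do not come from the text.

```python
import numpy as np

# Single-particle basis vectors: |up> = (1, 0), |down> = (0, 1)
up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

def ket(a, b):
    """Product ket |a>_1 |b>_2 as a 4-component vector (Kronecker product)."""
    return np.kron(a, b)

# SWAP exchanges the particle labels: SWAP |a>_1 |b>_2 = |b>_1 |a>_2
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=float)

symmetric = [
    ket(up, up),
    (ket(up, down) + ket(down, up)) / np.sqrt(2),
    ket(down, down),
]
antisymmetric = (ket(up, down) - ket(down, up)) / np.sqrt(2)

def exchange_sign(state):
    """Return +1 (symmetric), -1 (antisymmetric) or None (neither)."""
    swapped = SWAP @ state
    if np.allclose(swapped, state):
        return 1
    if np.allclose(swapped, -state):
        return -1
    return None

print([exchange_sign(s) for s in symmetric])  # the three kets of Equation 4.38
print(exchange_sign(antisymmetric))           # the ket of Equation 4.39
print(exchange_sign(ket(up, down)))           # a bare product ket
```

The bare product ket $|\uparrow\downarrow\rangle$ returns `None`, which reflects the statement that the second-row product kets are neither symmetric nor antisymmetric on their own.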


Exercise 4.9 Verify that the three spin kets in Equation 4.38 are symmetric with respect
to swapping the labels of the particles.


The four symmetric and antisymmetric spin states for two spin-$\tfrac{1}{2}$ particles
listed in Equations 4.38 and 4.39 are the main output of this subsection. They
play the same role for the spin part of the wave function as the symmetric and
antisymmetric eigenfunctions did for the spatial part. Before putting the space and
spin parts together to form the total wave function, we shall characterize the
properties of these spin states more fully.


In the same way that we constructed the Hamiltonian operator for a system of two
particles in Section 4.1, we can introduce operators for the total spin of the
system. We define

$$\hat{\mathbf{S}} = \hat{\mathbf{S}}_1 + \hat{\mathbf{S}}_2 \quad\text{and}\quad \hat{S}_z = \hat{S}_{z1} + \hat{S}_{z2}.$$ (4.40)

In both cases, the operator for the whole system is the sum of the corresponding
operators for the two particles. Notice, however, that it follows from the definition
of $\hat{\mathbf{S}}$ that the operator for the square of the total spin is
$\hat{S}^2 = \hat{S}_1^2 + 2\hat{\mathbf{S}}_1\cdot\hat{\mathbf{S}}_2 + \hat{S}_2^2$,
which is not the same as $\hat{S}_1^2 + \hat{S}_2^2$.


Now it is easy to show that the symmetric and antisymmetric spin kets in
Equations 4.38 and 4.39 are all eigenvectors of the spin operator $\hat{S}_z$, and that their
eigenvalues can be expressed as $M_S\hbar$, where $M_S = 1$ for $|\uparrow\uparrow\rangle$, $M_S = 0$ for
$\frac{1}{\sqrt{2}}(|\uparrow\downarrow\rangle \pm |\downarrow\uparrow\rangle)$, and $M_S = -1$ for $|\downarrow\downarrow\rangle$. These results are demonstrated in
Exercise 4.5 and in Exercise 4.10 below.

(Capital letters are used to denote $S$ and $M_S$, the total spin and total spin magnetic quantum
numbers for the two-particle system; this is to distinguish them from $s$ and $m_s$, which refer
to single particles.)

The four spin kets in Equations 4.38 and 4.39 also turn out to be eigenvectors
of $\hat{S}^2$ with eigenvalues $S(S + 1)\hbar^2$, where $S = 1$ for the three symmetric
states and $S = 0$ for the antisymmetric state. They are therefore simultaneous
eigenvectors of $\hat{S}^2$ and $\hat{S}_z$; this is allowed because the operators $\hat{S}^2$ and $\hat{S}_z$ can be
shown to commute with one another, a detail we shall take on trust to avoid a
lengthy proof.
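These eigenvalue statements, and the commutation of $\hat{S}^2$ with $\hat{S}_z$, can be checked numerically. The sketch below assumes the standard representation of spin-$\tfrac{1}{2}$ operators as $\hbar/2$ times the Pauli matrices and works in units where $\hbar = 1$; the helper names `total` and `quantum_numbers` are illustrative only.

```python
import numpy as np

hbar = 1.0  # assumption: units in which hbar = 1

# Single-particle spin-1/2 operators: (hbar/2) times the Pauli matrices
sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def total(op):
    """Two-particle operator op_1 + op_2 on the 4-dimensional product space."""
    return np.kron(op, I2) + np.kron(I2, op)

Sx, Sy, Sz = total(sx), total(sy), total(sz)
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz  # square of the total spin

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
states = {
    "up up": np.kron(up, up),
    "sym": (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2),
    "down down": np.kron(down, down),
    "antisym": (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2),
}

def quantum_numbers(state):
    """Return (S, M_S), checking that state is an eigenvector of S2 and Sz."""
    s2 = np.vdot(state, S2 @ state).real          # S(S+1) hbar^2
    m = np.vdot(state, Sz @ state).real / hbar    # M_S
    assert np.allclose(S2 @ state, s2 * state)
    assert np.allclose(Sz @ state, m * hbar * state)
    S = (-1 + np.sqrt(1 + 4 * s2 / hbar**2)) / 2  # solve S(S+1) = s2 / hbar^2
    return int(round(float(S))), int(round(float(m)))

for name, state in states.items():
    print(name, quantum_numbers(state))

# The commutator [S2, Sz] should vanish
print(np.allclose(S2 @ Sz - Sz @ S2, 0))
```

Building the total operators with Kronecker products mirrors the definition in Equation 4.40: each single-particle operator acts on its own factor of the product space and leaves the other untouched.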


These facts give us an alternative way of describing the spin states of a pair of
identical spin-$\tfrac{1}{2}$ particles such as electrons. We introduce the notation $|S, M_S\rangle$ for
simultaneous eigenvectors of the total spin operators $\hat{S}^2$ and $\hat{S}_z$, where

$$\hat{S}^2\, |S, M_S\rangle = S(S + 1)\hbar^2\, |S, M_S\rangle$$ (4.41)

and

$$\hat{S}_z\, |S, M_S\rangle = M_S\hbar\, |S, M_S\rangle.$$ (4.42)


Then we can classify the four symmetric and antisymmetric spin states as follows. 
First we have the symmetric spin states. There are three of these, so they are 
called triplet states: 
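Triplet spin states for two identical spin-$\tfrac{1}{2}$ particles

$$|\uparrow\uparrow\rangle = |1, 1\rangle,$$

$$\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right) = |1, 0\rangle,$$

$$|\downarrow\downarrow\rangle = |1, -1\rangle.$$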


In the kets on the right-hand sides, the first number is the value of $S$ and the
second number is the value of $M_S$. For example, $|1, 1\rangle$ is $|S = 1, M_S = 1\rangle$,
written in full, and so on.


There is also an antisymmetric spin state. There is only one of these, so it is called 
the singlet state: 


Singlet spin state for two identical spin-$\tfrac{1}{2}$ particles

$$\frac{1}{\sqrt{2}}\left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right) = |0, 0\rangle.$$


Here $|0, 0\rangle$ is $|S = 0, M_S = 0\rangle$, written in full. We sometimes say that electron
spins in the singlet state are opposite but, strictly speaking, it is not correct to
think of either of the electrons as ‘having’ a definite value of $S_z$, since we cannot
say which electron is ‘spin up’ and which is ‘spin down’. This is a point we shall
refer to again when we discuss entanglement in the next chapter.


We emphasize that the |S, Ms) notation is just an alternative way of specifying 
the symmetric and antisymmetric spin states that are needed to describe particles 
like electrons. When using this notation, you must remember that the symmetric 
(triplet) states have S = 1, while the antisymmetric (singlet) state has S = 0. 


Exercise 4.10 Show that $|\downarrow\downarrow\rangle$ is an eigenvector of the total spin operator
$\hat{S}_z = \hat{S}_{z1} + \hat{S}_{z2}$. Find the eigenvalue of $\hat{S}_z$ and the quantum number $M_S$.


4.2.3 Putting the spatial and spin functions together 


We now have all the ingredients needed for constructing the total wave function of
a system of two identical spin-$\tfrac{1}{2}$ particles. Section 4.2.1 gave us symmetric
and antisymmetric energy eigenfunctions, which are functions of the particle
coordinates. Section 4.2.2 gave us symmetric and antisymmetric spin ket vectors
describing the spin states of two identical spin-$\tfrac{1}{2}$ particles. We are less concerned
here with other types of particle (with, say, spin-1) but, for completeness, we note
that their spin states can also be represented by appropriate $|S, M_S\rangle$ ket vectors.


Now, the total wave function of the system (depending on the spatial coordinates 
and spin) can be written as a product of a spatial eigenfunction depending on 
particle coordinates and a spin ket vector: 


$$\Psi_{1,2} = \psi(x_1, x_2)\, |S, M_S\rangle.$$


As we stated earlier, this total wave function must be either symmetric or
antisymmetric with respect to exchange of the particle labels, 1 and 2. There are 
four ways in which this can be achieved: 


• spatial symmetric × spin symmetric = total symmetric,
• spatial antisymmetric × spin antisymmetric = total symmetric,
• spatial symmetric × spin antisymmetric = total antisymmetric,
• spatial antisymmetric × spin symmetric = total antisymmetric.


The first two ways produce a symmetric total wave function, and the last two ways 
produce an antisymmetric total wave function. 


In the introduction to Section 4.2, we mentioned that all particles can be classified 
as either bosons or fermions. Systems composed of identical bosons are described 
by symmetric total wave functions, while systems composed of identical fermions 
are described by antisymmetric total wave functions. Electrons are spin-$\tfrac{1}{2}$
fermions, so the total wave function describing a pair of electrons takes one of the 
following antisymmetric forms. 


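Triplet states:

$$\psi^{-}(x_1, x_2)\,|1, 1\rangle = \frac{1}{\sqrt{2}}\left[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\right] |\uparrow\uparrow\rangle$$

$$\psi^{-}(x_1, x_2)\,|1, 0\rangle = \frac{1}{2}\left[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\right] \left(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\right)$$

$$\psi^{-}(x_1, x_2)\,|1, -1\rangle = \frac{1}{\sqrt{2}}\left[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\right] |\downarrow\downarrow\rangle$$

Singlet state:

$$\psi^{+}(x_1, x_2)\,|0, 0\rangle = \frac{1}{2}\left[\psi_n(x_1)\,\psi_k(x_2) + \psi_k(x_1)\,\psi_n(x_2)\right] \left(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\right)$$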


It is also worth noting that identical spinless particles are counted as being bosons 
with symmetric spin states, so the first possibility listed above is always realized 
for them, with a symmetric spatial wave function. 


Exercise 4.11 Two identical non-interacting spinless bosons occupy the
single-particle energy eigenfunctions $\psi_n(x)$ and $\psi_k(x)$. Write down an
appropriate two-particle spatial eigenfunction describing this pair of bosons.


Exercise 4.12 Construct the four possible total wave functions for the first
excited level of two identical, non-interacting, spin-$\tfrac{1}{2}$ fermions of mass $m$ in the
one-dimensional harmonic oscillator well of Exercise 4.8. What are the energies
of these two-particle states?


4.2.4 Fermions and bosons 


Although we have touched on the point previously, we are now ready to explore 
the crucial fact that all particles can be classified as being either bosons or 
fermions. How do we know whether a given particle is a boson or a fermion? The 
following definition makes the classification straightforward. 
(We say a particle has spin-$\tfrac{1}{2}$ if its spin quantum number $s = \tfrac{1}{2}$.)


Table 4.1 Some bosons and fermions. The symbol $^{A}_{Z}$X represents an atom or nucleus
of element X with atomic number $Z$ and mass number $A$; $Z$ is the number of protons and
$A$ is the total number of protons plus neutrons.

Bosons                        Fermions
photon                        electron
pion                          proton
deuteron, $^{2}_{1}$H         neutron
$^{4}_{2}$He atom             $^{3}_{2}$He atom
$^{85}_{37}$Rb atom           $^{85}_{37}$Rb nucleus
alpha particle                all quarks
$^{23}_{11}$Na atom           $^{23}_{11}$Na nucleus
N$_2$ molecule
O$_2$ molecule
Bosons and fermions
Bosons are particles with integer spin (s = 0,1,2,...). 
Fermions are particles with half-integer spin (s = 1/2,3/2,5/2,...). 


For example, we know that an electron has spin-$\tfrac{1}{2}$ so it is a fermion, and so are
other spin-$\tfrac{1}{2}$ particles such as protons and neutrons. Photons (light quanta) have
spin 1, and alpha particles ($^{4}_{2}$He nuclei) have zero spin, so these are both bosons.
More bosons and fermions are listed in Table 4.1. You will see from the table that
a nucleus that is composed of fermions may be a boson, for example $^{4}_{2}$He. The
general rule (established by Ehrenfest and Oppenheimer in 1930) is that a
composite particle containing an odd number of fermions is a fermion, while one
containing an even number of fermions is a boson. For example, a neutral $^{85}$Rb
atom is a boson, formed from 37 protons, 85 − 37 = 48 neutrons and 37 electrons
(122 fermions in all). Composite particles composed exclusively of bosons are
bosons.
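The counting rule is simple enough to express as a short function. The sketch below is illustrative only: it treats protons, neutrons and electrons as the constituent fermions (each nucleon contains three quarks, an odd number, so the parity of the count is unaffected), and the function names are hypothetical.

```python
def classify(n_fermions):
    """Even number of constituent fermions -> boson; odd number -> fermion."""
    return "boson" if n_fermions % 2 == 0 else "fermion"

def neutral_atom(Z, A):
    """Fermion count of a neutral atom: Z protons + (A - Z) neutrons + Z electrons."""
    return A + Z

def nucleus(Z, A):
    """Fermion count of a bare nucleus: A nucleons. Z is kept only for readability."""
    return A

# Examples taken from Table 4.1
examples = {
    "85Rb atom": neutral_atom(37, 85),    # 122 fermions
    "85Rb nucleus": nucleus(37, 85),      # 85 fermions
    "4He atom": neutral_atom(2, 4),
    "3He atom": neutral_atom(2, 3),
    "alpha particle": nucleus(2, 4),
}
for name, n in examples.items():
    print(f"{name}: {n} fermions -> {classify(n)}")
```

For an ion, the electron count would be reduced by the charge state before applying `classify`; that case is left out here.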


Exercise 4.13 Classify the following as fermions or bosons: a hydrogen atom;
a deuterium atom (with $^{2}_{1}$H nucleus); a singly-ionized helium atom (with $^{3}_{2}$He
nucleus); a $^{235}_{92}$U nucleus; a $^{238}_{92}$U nucleus; a $^{238}_{92}$U atom.


When we consider collections of identical particles, it is the boson or fermion 
nature of the particles that determines the symmetry of the total wave function. 


The symmetry of the total wave function 


The total wave function describing any system of identical bosons is 
symmetric with respect to particle exchange. 


The total wave function describing any system of identical fermions is 
antisymmetric with respect to particle exchange. 


The first question that we need to consider when writing the total wave function 
for a system of identical particles is: are the particles bosons or fermions? If they 
are bosons, the total wave function must be symmetric; if they are fermions, it 
must be antisymmetric. You have seen this principle used for pairs of particles, 
but it is true for systems containing any number of particles. The link between the 
boson or fermion nature of particles and symmetric or antisymmetric total wave 
functions can be regarded as a fundamental fact about the world we live in. It has 
actually been derived from other principles, including special relativity, but the 
proof (published by Wolfgang Pauli in 1940) is far beyond the scope of this 
course. 


A dramatic difference between bosons and fermions appears if we consider a state
in which all the particles have the same z-component of spin, so that the spin part
of the total wave function is symmetric. If the particles are identical bosons, the
total wave function is symmetric, so the spatial part of the wave function must
also be symmetric. This does not prevent the particles from all being in the same
single-particle spatial state, since the product function $\psi_n(x_1)\,\psi_n(x_2)\ldots$ is a
perfectly valid normalized symmetric function. There is no restriction on the
number of bosons, with the same spin, that can occupy a given spatial state
represented by $\psi_n(x)$. In particular, it is possible for all particles in a system of
identical bosons to be in the lowest single-particle energy level.


The situation for identical fermions is profoundly different. If two fermions have
the same z-component of spin, they must be in one of the symmetric spin states
$|\uparrow\uparrow\rangle$ or $|\downarrow\downarrow\rangle$. The total wave function of two identical fermions is antisymmetric,
so the spatial part of the wave function must take the antisymmetric form
$\psi^{-}(x_1, x_2) = \frac{1}{\sqrt{2}}\left[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\right]$, where $n$ and $k$ denote the
spatial states occupied by the particles. Now, if we set $n = k$, we see that the
spatial part of the wave function vanishes everywhere, from which we conclude
that two fermions with the same z-component of spin, $S_z$, cannot be in the same
spatial state. Electrons only have two different values of $S_z$, so the number of
electrons in a given spatial state is limited to two. This is the origin of the
‘classically non-describable two-valuedness’ discovered by Pauli and mentioned
in Section 3.1.2 of Chapter 3.
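The vanishing of the antisymmetric combination for $n = k$ can be seen directly on a grid. The sketch below assumes, purely for illustration, the infinite square well eigenfunctions $\psi_n(x) = \sqrt{2/L}\,\sin(n\pi x/L)$ with $L = 1$ in arbitrary units; the argument itself does not depend on this choice.

```python
import numpy as np

L = 1.0  # assumed well width (arbitrary units)

def psi(n, x):
    """Infinite square well eigenfunction, used here only as an example basis."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def psi_minus(n, k, x1, x2):
    """Antisymmetric two-particle spatial function built from states n and k."""
    return (psi(n, x1) * psi(k, x2) - psi(k, x1) * psi(n, x2)) / np.sqrt(2.0)

x = np.linspace(0.0, L, 201)
X1, X2 = np.meshgrid(x, x, indexing="ij")

same = psi_minus(1, 1, X1, X2)       # both particles in the same spatial state
different = psi_minus(1, 2, X1, X2)  # particles in different spatial states

# Riemann-sum estimate of the normalization integral for n != k
dx = x[1] - x[0]
norm = np.sum(different**2) * dx * dx

print("max |psi^-| for n = k:", np.max(np.abs(same)))
print("norm of psi^- for n = 1, k = 2:", norm)
```

For $n = k$ the two products are identical, so the function is zero at every grid point and cannot be normalized; for $n \neq k$ the factor $1/\sqrt{2}$ gives a normalization integral close to 1.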


This restriction on the number of electrons per spatial state has important
consequences for systems containing many electrons. If we imagine the electrons
being added progressively to the system, they cannot all go into the lowest
single-particle energy level, but are forced into higher and higher energy levels.
We will return to this point in the final section of this chapter.

Figure 4.4 Satyendranath Bose (1894-1974), after whom bosons are named.


Finally, we mention the reason for the names ‘boson’ and ‘fermion’. Bosons are 
named after the Indian physicist Satyendranath Bose (Figure 4.4), and fermions 
after the Italian physicist Enrico Fermi (Figure 4.5). The discovery of this way of 
categorizing particles had a complicated history involving many people, including 
Einstein and Dirac. It involved evidence coming from the thermal behaviour of 
macroscopic systems containing vast numbers of particles. The way that particles 
tend to occupy the various energy levels of macroscopic systems at different 
temperatures is described in the branch of physics known as statistical mechanics. 
Bose found rules appropriate for photons and Einstein generalized them to atoms 
and molecules that are bosons, producing what is known as Bose-Einstein
statistics. Fermi and Dirac presented the corresponding rules for fermions,
leading to Fermi-Dirac statistics. We shall not concern ourselves with the details
here, but it is interesting to note that many properties of matter can be explained
by combining the quantum-mechanical concepts of bosons and fermions with the
classical idea of temperature.

Figure 4.5 Enrico Fermi (1901-1954), after whom fermions are named. Fermi
received the Nobel prize for physics in 1938.

Exercise 4.14 The two protons within a helium nucleus have the same spatial
eigenfunction. What is the total spin of the two-proton state in this nucleus?


4.2.5 Spatial distribution of bosons and fermions 


The indistinguishability of identical particles in quantum mechanics gives rise to a 
remarkable effect for which there is no classical analogue. We return to the case 
illustrated for distinguishable particles in Figure 4.1 — the probability density for 
a system of two particles in a one-dimensional infinite square well of width L, 
with one particle in the ground state and the other in the first excited state. 


We now consider the same situation except that the particles will be taken to be
identical. The single-particle eigenfunctions that we need are, as before,

$$\psi_1(x) = \sqrt{\frac{2}{L}}\,\sin\left(\frac{\pi x}{L}\right) \quad\text{and}\quad \psi_2(x) = \sqrt{\frac{2}{L}}\,\sin\left(\frac{2\pi x}{L}\right).$$

The appropriate symmetric and antisymmetric combinations are

$$\psi^{+}(x_1, x_2) = \frac{1}{\sqrt{2}}\left[\psi_1(x_1)\,\psi_2(x_2) + \psi_2(x_1)\,\psi_1(x_2)\right],$$

$$\psi^{-}(x_1, x_2) = \frac{1}{\sqrt{2}}\left[\psi_1(x_1)\,\psi_2(x_2) - \psi_2(x_1)\,\psi_1(x_2)\right].$$

These are possible spatial wave functions for pairs of identical particles.

Born’s rule continues to apply to identical particles, but now in the following
form:

Born’s rule for two identical particles

The probability of finding one particle in a small interval $\delta x_1$ centred on $x_1$
and the other particle in a small interval $\delta x_2$ centred on $x_2$ is given by

$$|\Psi^{\pm}(x_1, x_2, t)|^2\, \delta x_1\, \delta x_2.$$

($\Psi^{\pm}$ is the symmetrical (+) or antisymmetrical (−) spatial wave function for the system.
Notice the subtle difference in wording between this box and the one containing Equation 4.8.)


Contour plots of $|\psi^{\pm}(x_1, x_2)|^2$ are shown in Figure 4.6. They look nothing like
the unsymmetrized probability densities of Figure 4.1.

Figure 4.6 The probability density for two non-interacting identical particles in
an infinite square well: (a) symmetric spatial wave function; (b) antisymmetric
spatial wave function. In both panels, the bottom left-hand corner has coordinates
$x_1 = x_2 = 0$.


For the symmetric spatial wave function (Figure 4.6a), the probability density has
two maxima lying on the $x_1 = x_2$ line (near $x_1 = x_2 \approx 0.30L$ and $x_1 = x_2 \approx 0.70L$). As we move away
from this line, the probability density decreases. It is as if the particles were
mysteriously attracted to one another, although there is no force between them.
We might say that the particles described by $\psi^{+}(x_1, x_2)$ ‘prefer’ to crowd
together in places where $x_1 = x_2$.


Figure 4.6b shows the probability density for the antisymmetric spatial wave
function. By contrast, this is equal to zero along the $x_1 = x_2$ line. The maxima of
this probability distribution occur when $x_1$ and $x_2$ have very different values. It is
as if the particles were mysteriously repelled, although there is no force between
them. We might say that particles described by $\psi^{-}(x_1, x_2)$ ‘prefer’ to avoid each
other.


We emphasize again that the ‘crowding’ and ‘avoidance’ of particles displayed in
Figure 4.6 are purely quantum-mechanical effects resulting from the identity of
the particles. They have nothing to do with forces because the Hamiltonian operator of
the system contains no term $V(x_1 - x_2)$ that would describe a mutual interaction
between the particles.
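The crowding and avoidance can be reproduced by evaluating the two probability densities on a grid. This is a minimal sketch, assuming $L = 1$ in arbitrary units and a 401-point grid; the helper names `psi_pm` and `peak` are illustrative.

```python
import numpy as np

L = 1.0  # assumed well width (arbitrary units)

def psi(n, x):
    """Single-particle infinite square well eigenfunction."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def psi_pm(x1, x2, sign):
    """Symmetric (sign=+1) or antisymmetric (sign=-1) combination of states 1 and 2."""
    return (psi(1, x1) * psi(2, x2) + sign * psi(2, x1) * psi(1, x2)) / np.sqrt(2.0)

x = np.linspace(0.0, L, 401)
X1, X2 = np.meshgrid(x, x, indexing="ij")
rho_plus = psi_pm(X1, X2, +1) ** 2   # density for the symmetric function
rho_minus = psi_pm(X1, X2, -1) ** 2  # density for the antisymmetric function

def peak(rho):
    """Grid coordinates (x1, x2) of one global maximum of rho."""
    i, j = np.unravel_index(np.argmax(rho), rho.shape)
    return x[i], x[j]

diag_minus = np.diag(rho_minus)  # antisymmetric density along x1 = x2

print("symmetric: one maximum at (x1, x2) =", peak(rho_plus))
print("antisymmetric: one maximum at (x1, x2) =", peak(rho_minus))
print("antisymmetric: largest density on the line x1 = x2:", diag_minus.max())
```

Each density has two equal maxima related by reflection, and `peak` reports only one of them. The symmetric maxima sit on the line $x_1 = x_2$, while the antisymmetric maxima sit well away from it and the density on that line is zero.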


Both Figures 4.6a and 4.6b apply to bosons and fermions. Figure 4.6a assumes a
symmetric spatial wave function. Since identical bosons have a symmetric total
wave function, this case applies to identical bosons in a symmetric spin state (or
to spinless bosons). Since identical fermions have an antisymmetric total wave
function, it also applies to identical fermions in an antisymmetric spin state. For a
pair of electrons, this implies the singlet spin state. By contrast, Figure 4.6b assumes
an antisymmetric spatial wave function; this is found for identical bosons in an
antisymmetric spin state, or identical fermions in a symmetric spin state. For a
pair of electrons, this implies a triplet spin state.


4.3 Consequences of particle indistinguishability 


4.3.1 The Pauli exclusion principle 


The rules applying to fermions, stated in the previous section, have a momentous 
consequence: the Pauli exclusion principle, proposed by Wolfgang Pauli in 1925. 
Initially formulated in terms of electrons, this principle applies to all fermions and 
can be stated as follows: 


Pauli exclusion principle 


No two identical fermions can exist in the same quantum state. 


This is equivalent to saying that no two fermions in a system can ever have
exactly the same set of quantum numbers. It is understood that both the state, and
the set of quantum numbers, include a specification of spin. As you will see in
Book 3, the Pauli exclusion principle is crucial for explaining atomic spectra,
the Periodic Table, and the bonding of atoms to form molecules. It also helps
to explain the behaviour of nuclei, white dwarf stars and neutron stars, and
determines many of the properties of the matter around us.


Figure 4.7 Wolfgang Pauli 
(1900-1958). Pauli was awarded 
the Nobel prize for physics in 
1945 for his discovery of the 
exclusion principle. 
The Pauli exclusion principle follows from the fact that identical fermions must be
described by an antisymmetric total wave function. As you saw in the previous
section, this implies that two electrons with the same z-component of spin, $S_z$,
cannot be in the same spatial state $\psi_n(x)$. This is because two electrons with
the same value of $S_z$ are in a symmetric spin state, so the spatial part must be
antisymmetric. However, there is no (non-zero) antisymmetric spatial wave
function having both particles in the same spatial state, $\psi_n(x)$. The label $n$ here
could represent a single quantum number or several quantum numbers in the
three-dimensional case. We therefore conclude that not all the quantum numbers
can be the same. There is no such restriction if the pair of electrons is in an
antisymmetric spin state but, in this case, the two spin quantum numbers, $m_{s1}$ and
$m_{s2}$, are different, so the two electrons never have exactly the same set of quantum
numbers. This property, described here for electrons, applies to any number of
identical fermions: it is the Pauli exclusion principle.


4.3.2 Rigidity of metals 


The Pauli exclusion principle applies to all systems of identical fermions. The 
electrons in a metal provide an example of such a system. The atoms in a metal 
lose one or more electrons, becoming positively-charged ions. The ‘free’ electrons 
then move through the whole body of the metal. Although the electrons are 
subject to the mutual repulsion and attractive forces of the ions, various properties 
of metals can be explained by treating the electrons as free particles. 


We can model the behaviour of free electrons in a metal as follows. The electrons
are treated as non-interacting particles of mass $m$ within a metal cube having sides
of length $L$. Within this cube, the potential energy of the electrons can be taken to
be zero, with the sides of the cube constituting an infinite barrier. The energy
levels of an electron are therefore just those of a particle in a three-dimensional
infinite square well:

$$E = \frac{(n_x^2 + n_y^2 + n_z^2)\,\pi^2\hbar^2}{2mL^2}.$$ (4.43)

Here $n_x$, $n_y$ and $n_z$ are the quantum numbers associated with the coordinates $x$, $y$
and $z$, respectively. Due to the Pauli exclusion principle, no more than two
electrons can have the same value for these three quantum numbers. (If all three
spatial quantum numbers are the same, then the electrons must have different spin
quantum numbers: $m_s = 1/2$ and $m_s = -1/2$.) Hence the quantum states in the
box are filled in a very specific way: electrons cannot all go to the lowest spatial
state (defined by $n_x = n_y = n_z = 1$). A maximum of two electrons will occupy
this spatial state. The next two electrons must occupy a different spatial state
(with at least one of the quantum numbers being different from 1); the electrons
therefore pile up into higher and higher levels. As a result, no matter how much
we try to lower the energy of the electrons in the metal, there will always be
electrons with large energies.


This is one reason why metals resist being compressed. If we try to compress the 
metal, and therefore make L in Equation 4.43 smaller, the energies of all the 
electrons increase. The electrons cannot lose energy by dropping down into lower 
energy levels because these are all full. The increase in energy requires work done 
by the compressing force — so much work, in fact, that it is hard to achieve an 
appreciable decrease in volume. The same mechanism is responsible for the
stability of white dwarf stars. A white dwarf star is rather like a huge piece of 
metal, with all the electrons in a giant potential energy well the size of a star. 
There is a tendency for gravity to compress the star, but a significant compression 
would be accompanied by a huge increase in the energy of the star, more than is 
available from the loss of gravitational potential energy, so the white dwarf star 
resists compression and remains in near-equilibrium. 
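The filling of levels, and the way the total energy grows as the box shrinks, can be illustrated with a short calculation based on Equation 4.43. The sketch below works in units of $\pi^2\hbar^2/2m$, so energies are simply $(n_x^2 + n_y^2 + n_z^2)/L^2$; the function names and the cut-off `n_max` are assumptions made for the illustration.

```python
import itertools

def spatial_levels(n_max):
    """Sorted values of n_x^2 + n_y^2 + n_z^2 for quantum numbers 1..n_max."""
    return sorted(nx * nx + ny * ny + nz * nz
                  for nx, ny, nz in itertools.product(range(1, n_max + 1), repeat=3))

def fill_levels(n_electrons, L=1.0, n_max=12):
    """Lowest total energy and highest occupied energy for n_electrons >= 1.

    Energies are in units of pi^2 hbar^2 / (2 m). Each spatial state holds at
    most two electrons (m_s = +1/2 and m_s = -1/2).
    """
    levels = spatial_levels(n_max)
    n_states = (n_electrons + 1) // 2
    # Every state with n_x^2 + n_y^2 + n_z^2 below (n_max + 1)^2 + 2 lies
    # inside the enumerated cube, so the list is complete up to that value.
    if levels[n_states - 1] >= (n_max + 1) ** 2 + 2:
        raise ValueError("increase n_max")
    occupied = [levels[i // 2] for i in range(n_electrons)]
    return sum(occupied) / L**2, occupied[-1] / L**2

total, highest = fill_levels(10)
print("10 electrons, L = 1:   total =", total, " highest occupied =", highest)
print("10 electrons, L = 0.5: total =", fill_levels(10, L=0.5)[0])
print("10 bosons all in the lowest state, L = 1: total =", 10 * 3.0)
```

Halving $L$ multiplies every occupied energy by four, which is the work that a compressing force would have to supply. The boson comparison on the last line shows how much lower the total energy would be without the exclusion principle.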


Exercise 4.15 Consider a three-dimensional box containing many electrons. Is
it correct to say that the Pauli exclusion principle implies that there can only be a
maximum of two electrons with the same energy?


4.3.3 Bose-Einstein condensation


In Chapter 4 of Book 1 we mentioned an experiment in which about 2000
rubidium atoms were trapped by a magnetic field and cooled down so that
a large number of them occupied the ground state of the well in which they
were confined. The resulting ‘substance’ is a new phase of matter called a
Bose-Einstein condensate (BEC). What is this phase of matter? Why and how is
it formed? And more importantly, why is it interesting?


The existence of Bose-Einstein condensates was predicted by Einstein in the
1920s. His prediction concerns the low-temperature behaviour of a gas of atoms
that are bosons. When such a gas is so cold that the de Broglie wavelength $\lambda_{\text{dB}}$ of
the atoms is comparable to the spacing between them, the atoms behave in a
collective way, and the whole gas can be described by a single wave function. In
this sense, the atoms lose their individuality.


We define the atom number density, $N$, as the number of atoms per unit volume.
Then, if we call the mean inter-atom separation $d$, we can set $d = N^{-1/3}$ because
there will, on average, be one atom within a volume $d^3$. The criterion for the
formation of a Bose-Einstein condensate is that

$$N \lambda_{\text{dB}}^3 \geq 2.612.$$ (4.44)


Consider what happens to atoms that are bosons trapped in a small box. We 
assume that every atom is in its internal ground state, an accurate assumption at 
very low temperatures. The kinetic energy of the atoms is quantized since the box 
acts as a three-dimensional infinite square well. At room temperature, there is a 
distribution of particles over all the possible energy levels. But, as the temperature 
falls, the atoms move from higher energy levels to lower energy levels; their 
average speed decreases and their average de Broglie wavelength increases. Now 
the fact that the atoms are bosons comes into play. There is no reason why the 
bosons should not all be in the same quantum state. In fact, it can be shown
that bosons ‘prefer’ to be in the same state (much as they ‘prefer’ to crowd
together). It turns out that at some (very low) critical temperature, a phase
transition takes place in which many of the atoms ‘condense’ into the lowest
energy single-particle state. These atoms form the Bose-Einstein condensate,
which behaves in many respects like a single entity, described by a so-called
macroscopic wave function $\Psi(\mathbf{r}, t)$. The remarkable point about this wave
function is that it depends only on a single position vector $\mathbf{r}$, not on the position
vectors of all the particles in the condensate. It turns out that the number density
of particles in the condensate is given by $|\Psi(\mathbf{r}, t)|^2$. The process of Bose-Einstein
condensation is represented schematically in Figure 4.8.

(At temperatures much higher than those required for Bose-Einstein condensation,
most gases suffer a phase transition into a liquid, and then solid, state. To produce a
Bose-Einstein condensate, these ‘normal’ transitions must be avoided.)
Figure 4.8 The formation of a Bose-Einstein condensate. (a) At high
temperatures, the average distance between atoms is larger than their typical
de Broglie wavelength and the atoms behave like individual particles. (b) As the
temperature decreases, the de Broglie wavelength increases. (c) At a critical
temperature, the de Broglie wavelength becomes comparable to the distance
between atoms, and a Bose-Einstein condensate, described by a macroscopic
wave function $\Psi(\mathbf{r}, t)$, starts to form. (d) At a temperature close to absolute zero,
practically all the atoms belong to the condensate. (Adapted from Wolfgang
Ketterle’s Nobel lecture (2001).)


It was not until 1995 that a gaseous Bose-Einstein condensate was finally realized
experimentally. We can say a few words about how this was done. A gas of atoms
that are bosons will undergo a phase transition to a Bose-Einstein condensate
if Equation 4.44 is satisfied, which suggests high number densities and long
de Broglie wavelengths. However, there is an upper limit to the number density:
when atoms are very close, they start to interact with one another and form
molecules or undergo other phase transitions. This puts a limit on the minimum
distance $d$ between the atoms. Since $d = N^{-1/3}$, there is an upper limit to the
number densities $N$ that we can use. We are left with increasing the de Broglie
wavelength. Since $\lambda_{\text{dB}}$ increases as the atomic speed falls, the temperature must
be very low, close to absolute zero.
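To get a feel for the temperatures involved, Equation 4.44 can be turned into an estimate of the critical temperature. The sketch below assumes that $\lambda_{\text{dB}}$ is the thermal de Broglie wavelength of an ideal gas, $\lambda_{\text{dB}} = h/\sqrt{2\pi m k_B T}$, which is not defined in the text above. The number density is a hypothetical value chosen only for illustration, and the atomic mass is approximated as 87 atomic mass units.

```python
import math

h = 6.62607015e-34     # Planck constant, J s
k_B = 1.380649e-23     # Boltzmann constant, J/K
u = 1.66053906660e-27  # atomic mass unit, kg

def thermal_wavelength(mass, T):
    """Assumed thermal de Broglie wavelength h / sqrt(2 pi m k_B T), in metres."""
    return h / math.sqrt(2 * math.pi * mass * k_B * T)

def degeneracy(N, mass, T):
    """Left-hand side of Equation 4.44: N * lambda_dB^3 (dimensionless)."""
    return N * thermal_wavelength(mass, T) ** 3

def critical_temperature(N, mass):
    """Temperature at which N * lambda_dB^3 reaches 2.612."""
    lam_c = (2.612 / N) ** (1.0 / 3.0)
    return h**2 / (2 * math.pi * mass * k_B * lam_c**2)

m_rb = 87 * u   # approximate mass of a rubidium-87 atom
N = 2.5e19      # hypothetical number density, atoms per cubic metre

T_c = critical_temperature(N, m_rb)
print(f"estimated critical temperature: {T_c:.2e} K")
print("N * lambda^3 at T_c:  ", degeneracy(N, m_rb, T_c))
print("N * lambda^3 at 300 K:", degeneracy(N, m_rb, 300.0))
```

At room temperature the left-hand side of Equation 4.44 is many orders of magnitude below 2.612, which is why the gas must be cooled to within a fraction of a microkelvin of absolute zero. The estimate ignores the trapping potential and interactions between atoms.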


In 1995, Eric Cornell and Carl Wieman cooled $^{87}$Rb atoms down to around
$1.7 \times 10^{-7}$ K by using lasers and a technique called evaporative cooling. Soon
afterwards, Wolfgang Ketterle produced a condensate of sodium atoms. Because
Ketterle’s condensate contained more atoms, he was able to split it into two and
then let the two parts interact to form an interference pattern (Figure 4.9), rather
like the pattern found when light passes through a double slit. This pattern
is evidence that the Bose-Einstein condensate can be described by a single
macroscopic wave function. Cornell, Wieman and Ketterle won the Nobel prize
for physics in 2001 for the achievement of Bose-Einstein condensation.


Figure 4.9 Interference fringes produced by the superposition of two 
Bose-Einstein condensates consisting of sodium atoms. The fringes represent 
variations in the density of the gas due to the interference of the wave functions 
representing each condensate. 


Much research has been devoted to ultra-cold atoms since the first Bose-Einstein 
condensate was created. Remarkably, condensates containing atoms that are 
fermions have also been produced. How is this possible? The Pauli exclusion 
principle prevents fermions from condensing into the lowest energy state. The 
trick is that pairs of fermion-type atoms can join to produce molecules that are 
bosons, and it is these molecules that form the condensate. 


Summary of Chapter 4 


Section 4.1 The Hamiltonian operator $\hat{H} = \hat{H}_1 + \hat{H}_2$ for a system of two
non-interacting particles is a sum of Hamiltonian operators associated with the
individual particles. It acts on a wave function $\Psi(x_1, x_2, t)$ that depends on the
coordinates of both particles.

For two distinguishable particles, Born’s rule states that the probability of finding
particle 1 in a small interval $\delta x_1$, centred on $x_1$, and particle 2 in a small interval
$\delta x_2$, centred on $x_2$, is $|\Psi(x_1, x_2, t)|^2\, \delta x_1\, \delta x_2$.

Separating the space and time variables in Schrödinger’s equation,
we obtain stationary state solutions which are the product of an energy
eigenfunction $\psi(x_1, x_2)$ and a time-dependent phase factor, $e^{-iEt/\hbar}$.

Solving the time-independent Schrödinger equation for a system of two
non-interacting distinguishable particles gives energy eigenfunctions of the
form $\psi(x_1, x_2) = \psi_n(x_1)\,\psi_k(x_2)$, where $n$ and $k$ are quantum numbers of
single-particle states with eigenvalues $E_n$ and $E_k$, and $E_n + E_k = E$. For
distinguishable spin-$\tfrac{1}{2}$ particles, a more complete description is provided by the
total wave function $\Psi_{1,2} = \psi(x_1, x_2)\,|m_{s1}, m_{s2}\rangle$. Operators that depend on
coordinates and momenta act on the spatial part of the total wave function, while
spin operators act on the spin part.
Section 4.2 In the quantum world, all particles sharing the same set of
unalterable attributes are identical. All electrons are identical, for example. Any 
particle can be classified as being either a boson or a fermion. For identical 
bosons, the total wave function is always symmetric (unchanged by swapping a 
pair of particle labels); for identical fermions, the total wave function is always 
antisymmetric (reversed in sign by swapping the particle labels). 


The spatial part of the total wave function must either be symmetric ($\psi^+(x_1, x_2)$)
or antisymmetric ($\psi^-(x_1, x_2)$), where

$$\psi^{\pm}(x_1, x_2) = \frac{1}{\sqrt{2}}\big[\psi_n(x_1)\,\psi_k(x_2) \pm \psi_k(x_1)\,\psi_n(x_2)\big].$$

Both these possibilities are consistent with the fact that the probability density is
independent of the scheme used to label the particles.
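The exchange symmetry of these two combinations can be checked numerically. The following is a minimal sketch, assuming (purely for illustration) that the single-particle states are infinite square well eigenfunctions of width `L`; the helper names `psi` and `psi_pm` are hypothetical and not part of the course notation.

```python
import numpy as np

L = 1.0  # assumed well width (arbitrary units)

def psi(n, x):
    """Normalized infinite-well eigenfunction on 0 <= x <= L (illustrative choice)."""
    return np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)

def psi_pm(n, k, x1, x2, sign):
    """Symmetric (sign=+1) or antisymmetric (sign=-1) two-particle spatial function."""
    return (psi(n, x1) * psi(k, x2) + sign * psi(k, x1) * psi(n, x2)) / np.sqrt(2.0)

x1, x2 = 0.2, 0.6
sym = psi_pm(1, 2, x1, x2, +1)
anti = psi_pm(1, 2, x1, x2, -1)

# Swapping the particle labels means swapping the two coordinates.
sym_swapped = psi_pm(1, 2, x2, x1, +1)
anti_swapped = psi_pm(1, 2, x2, x1, -1)

# The antisymmetric function vanishes whenever the two coordinates coincide.
anti_on_diagonal = psi_pm(1, 2, 0.4, 0.4, -1)

print(sym, sym_swapped)    # equal
print(anti, anti_swapped)  # equal and opposite
```

In both cases the squared modulus is unchanged by the swap, which is the label-independence of the probability density referred to above.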


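The spin part of the total wave function must also be symmetric or antisymmetric.
For identical spin-$\frac{1}{2}$ particles, the possibilities are:

symmetric: $\frac{1}{\sqrt{2}}\big(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\big)$, $|\uparrow\uparrow\rangle$, $|\downarrow\downarrow\rangle$,
antisymmetric: $\frac{1}{\sqrt{2}}\big(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\big)$.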


These spin states are eigenfunctions of the total spin operators $\hat{S}^2$ and $\hat{S}_z$, and they
can be classified in terms of the eigenvalues $S(S+1)\hbar^2$ and $M_S\hbar$ of these
operators. Writing them in the same order as before, we have:

symmetric: $|1,0\rangle$, $|1,1\rangle$, $|1,-1\rangle$,
antisymmetric: $|0,0\rangle$.


The three symmetric states are called triplet states and have S = 1, and the 
antisymmetric state is called the singlet state and has S = 0. 
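The quantum numbers quoted for the triplet and singlet states can be verified with small matrices. This is a sketch under stated assumptions: units with $\hbar = 1$, the standard Pauli-matrix representation of single-particle spin, and Kronecker products to build the two-particle space. The function name `quantum_numbers` is hypothetical.

```python
import numpy as np

hbar = 1.0  # work in units where hbar = 1 (assumption for readability)

# Single-particle spin-1/2 matrices (Pauli matrices times hbar/2).
sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
sy = 0.5 * hbar * np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

# Total spin components act on the four-dimensional two-particle space.
Sx, Sy, Sz = (np.kron(s, I2) + np.kron(I2, s) for s in (sx, sy, sz))
S2 = Sx @ Sx + Sy @ Sy + Sz @ Sz

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
states = {
    "triplet, M_S = 0": (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2),
    "triplet, M_S = +1": np.kron(up, up),
    "triplet, M_S = -1": np.kron(down, down),
    "singlet": (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2),
}

def quantum_numbers(ket):
    """Return (S(S+1), M_S), in units of hbar^2 and hbar, for an eigenstate."""
    s2 = np.vdot(ket, S2 @ ket).real / hbar**2
    ms = np.vdot(ket, Sz @ ket).real / hbar
    return s2, ms

for name, ket in states.items():
    print(name, quantum_numbers(ket))
```

A value $S(S+1) = 2$ corresponds to $S = 1$ and a value of 0 to $S = 0$.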


Bosons are particles of integer spin (s = 0,1,2,...). Fermions are particles of 
half-integer spin (s = 1/2,3/2,5/2,...). For example, photons are bosons, and 
electrons, protons and neutrons are fermions. Composite particles made up of an 
odd number of fermions are fermions; those made up of an even number of 
fermions are bosons. Composite particles made up exclusively of bosons are 
bosons. 
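The counting rule for composite particles is simple enough to express in a few lines. The sketch below assumes a neutral atom treated as a bound collection of protons, neutrons and electrons; the function `classify` is a hypothetical helper, and the two helium isotopes are chosen only as familiar examples.

```python
def classify(n_protons, n_neutrons, n_electrons):
    """Classify a composite particle by counting its spin-1/2 constituents."""
    n_fermions = n_protons + n_neutrons + n_electrons
    return "fermion" if n_fermions % 2 == 1 else "boson"

helium_4 = classify(n_protons=2, n_neutrons=2, n_electrons=2)  # six fermions
helium_3 = classify(n_protons=2, n_neutrons=1, n_electrons=2)  # five fermions
print(helium_4, helium_3)
```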


Systems of bosons are represented by symmetric total wave functions, so the 
spin and spatial parts have the same symmetry (both symmetric or both 
antisymmetric). Systems of identical fermions are represented by antisymmetric 
total wave functions, so the spin and spatial parts have opposite symmetries. A 
pair of electrons in a symmetric (i.e. triplet) spin state is represented by an
antisymmetric spatial wave function $\psi^-(x_1, x_2)$. A pair of electrons in the
antisymmetric (i.e. singlet) spin state is represented by a symmetric spatial wave
function $\psi^+(x_1, x_2)$.


The probability densities for systems of identical particles differ markedly from 
those for non-identical particles. Both the overall spin state and the boson/fermion 
nature of the particles matter. Identical bosons in a symmetric spin state
show a marked tendency to ‘crowd together’, as do identical fermions in an
antisymmetric spin state. However, identical bosons in an antisymmetric spin
state and identical fermions in a symmetric spin state show the contrary tendency:
they avoid each other. These effects arise from the symmetry or antisymmetry of 
the wave function and are not connected with forces acting between the particles. 


Section 4.3 As a result of the antisymmetry of their total wave function, 
identical fermions obey the Pauli exclusion principle: no two identical fermions 
can be in the same quantum state, with the same set of quantum numbers 
(including those associated with spin). The Pauli exclusion principle has profound 
effects on the chemical and physical properties of matter. For example, it explains 
the incompressibility of metals and the stability of white dwarf stars. 


A Bose-Einstein condensate is a phase of matter formed by boson-type atoms 
at extremely low temperatures, when their de Broglie wavelengths become 
comparable to the inter-atomic spacing. At a critical temperature, a phase 
transition takes place in which many of the atoms ‘condense’ into the lowest 
energy single-particle state. These atoms form the Bose-Einstein condensate,
which behaves in many respects like a single entity, described by a macroscopic
wave function, $\Psi(\mathbf{r}, t)$.


Achievements from Chapter 4 


After studying this chapter, you should be able to: 
4.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


4.2 Write down the Hamiltonian operator for a system of two particles, and the 
corresponding Schrödinger equation. 

4.3 Explain why solutions of the time-independent Schrödinger equation for a
system of two non-interacting distinguishable particles can be written as 
products of two single-particle eigenfunctions, each depending only on the 
coordinate of one of the particles. 


4.4 Find the effects of operators acting on the spatial and spin parts of total wave 
functions. 


4.5 Write down possible spatial wave functions for a system of two identical 
fermions or bosons. 


4.6 Write down possible state vectors for a system of two identical spin-$\frac{1}{2}$
particles using ket notation. Assign the associated quantum numbers $S$
and $M_S$, and identify the spin states as being singlet or triplet.


4.7 Combine spatial wave functions and spin kets to produce the total wave 
function of a system of two identical spin-$\frac{1}{2}$ particles.


4.8 Identify whether particles are bosons or fermions, given information about 
their spin or composition. 


4.9 Relate the symmetry of the total wave function describing a system of 
identical particles to the boson or fermion nature of its particles. 


4.10 Explain how the probability density of a pair of identical particles is affected 
by the symmetry or antisymmetry of their spatial wave function. 


4.11 State the Pauli exclusion principle and describe some of its consequences. 


4.12 Describe the formation and behaviour of a Bose—Einstein condensate. 
Chapter 5 The principles of quantum mechanics:
a review 


Introduction 


Chapter 2 of Book 1 presented a list of statements described as ‘Preliminary 
principles of wave mechanics’. You have seen a great deal of quantum mechanics 
since then, including additional principles, new notation and many examples of 
quantum mechanics in action. Now is a good time to pause and take stock. 


This chapter contains little that is new; much of it will draw together ideas 
introduced in Book 1 and the first four chapters of this book. However, there will 
also be some deeper discussion of what happens when we make a measurement in 
quantum mechanics and how we deal with continuous ranges of possible values. 
The aim is to arrive at an updated list of the principles of quantum mechanics, and 
to prepare the way for the remarkable physics in the last two chapters of this book. 


We begin by reproducing the list of preliminary principles of wave mechanics, 
exactly as they appeared in Chapter 2 of Book 1. 


Box 1: Preliminary principles of wave mechanics

1. The state of a system at time $t$ is represented by a wave function $\Psi(x, t)$.

2. An observable, such as energy or momentum, is represented by a linear
operator, such as $-i\hbar\,\partial/\partial x$ for the momentum component $p_x$.

3. As a general rule, the only possible outcomes of a measurement of
an observable are the eigenvalues of the associated operator.

4. The time-evolution of a system in a given state is governed by
Schrödinger’s equation.

5. A measurement will cause the collapse of the wave function: a sudden
and abrupt change that is not described by Schrödinger’s equation.


Now it is clear that these principles need to be overhauled and extended. Take the 
first principle, for example, which refers to a wave function $\Psi(x, t)$. Such a
function may describe the state of a single spinless particle travelling along the 
x-axis, but it cannot describe the state of a system of many particles in three 
dimensions, or even the state of a single particle with spin. It is also clear that 
some key quantum-mechanical ideas are missing, including the rules needed to 
predict the probabilities of particular experimental outcomes and the rules 
describing the special properties of systems of identical particles. 


We will go through these principles, reviewing and extending them in
Sections 5.1-5.5. Finally, in Section 5.6, we will draw together a revised list of
principles of quantum mechanics. The change in title is significant: the list will no 
longer be preliminary, and it will encompass the whole of quantum mechanics, 
not just the wave mechanics of Book 1. 
5.1 Describing the state of a system


In the context of wave mechanics, the state of an isolated system is described by a 
wave function. The variables in the wave function depend on the type of system 
under study: 


• For a particle in one dimension, the wave function can be written as $\Psi(x, t)$, a
complex-valued function of a single position coordinate, $x$, and time, $t$.

• For two particles in three dimensions, the wave function can be written as
$\Psi(\mathbf{r}_1, \mathbf{r}_2, t)$ or as $\Psi(x_1, y_1, z_1, x_2, y_2, z_2, t)$.


The wave function for an atom with many electrons depends on the position
vectors of all the electrons as well as time: $\Psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n, t)$. The wave
function therefore does not have a (complex) value at each point in space, r. This 
is unlike an electric field, for example, which has the value E(r, t) at position r 
and time t. 


In order to be consistent with Born’s rule, each wave function must be normalized:
the square of its modulus, integrated over all the coordinates of all the particles,
must be equal to 1. For a single particle in three dimensions, this means that

$$\int_{-\infty}^{\infty} |\Psi(\mathbf{r}, t)|^2 \, dV = 1. \tag{5.1}$$


For a pair of particles in one dimension, it means that 


$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |\Psi(x_1, x_2, t)|^2 \, dx_1 \, dx_2 = 1, \tag{5.2}$$

where $dx_1$ and $dx_2$ are infinitesimal intervals associated with the positions of
particles 1 and 2, and both integrals extend over the whole of the x-axis. As 
the number of particles increases, the normalization integrals become more 
complicated to write down, but the principle remains the same. 


Given a wave function, we can extract information from it and make (generally 
probabilistic) predictions. However, there is some redundancy in the way a 
wave function is specified. Any two wave functions that differ by an overall 
multiplicative factor $e^{i\alpha}$, where $\alpha$ is real, represent the same state. A factor
like $e^{i\alpha}$ (a complex number of unit modulus) is called a phase factor. When
normalizing a wave function, any phase factor can be chosen. 
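Both points, normalization and the irrelevance of an overall phase factor, are easy to illustrate on a grid. The sketch below assumes a Gaussian wave packet as the example state and approximates the integral by a simple Riemann sum; the grid limits, the packet and the value of `alpha` are all arbitrary illustrative choices.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 4001)
dx = x[1] - x[0]

# Unnormalized Gaussian wave packet with a plane-wave factor (illustrative choice).
psi = np.exp(-x**2 / 2.0) * np.exp(1j * 1.5 * x)

# Normalize so that the integral of |psi|^2 over x equals 1.
norm = np.sqrt(np.sum(np.abs(psi)**2) * dx)
psi_normalized = psi / norm

alpha = 0.7  # any real number gives a valid phase factor
psi_phase = np.exp(1j * alpha) * psi_normalized

total_probability = np.sum(np.abs(psi_normalized)**2) * dx
print(total_probability)
print(np.allclose(np.abs(psi_phase)**2, np.abs(psi_normalized)**2))
```

The two wave functions differ at every point, yet their probability densities agree everywhere, so no position measurement can tell them apart.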


Exercise 5.1 For given functions f(x) and g(x), which of the following wave 
functions represent the same state at t = 0? 


$\Psi_1 = f(x) + i\,g(x)$, $\quad \Psi_2 = i\,f(x) + g(x)$, $\quad \Psi_3 = i\,f(x) - g(x)$. ■


The notation of wave functions is explicit but cumbersome. At the beginning of 
this book we introduced a shorthand notation, due to Dirac, which represents 
states by vectors in an abstract vector space. The state of a system is then 
represented by a state vector, written as $|\Psi\rangle$. Along with this notation comes the
idea of defining an inner product between vectors. In one-dimensional wave
mechanics, the inner product is defined by

$$\langle\phi|\psi\rangle = \int_{-\infty}^{\infty} \phi^*(x)\,\psi(x)\,dx, \tag{5.3}$$

where the symbol $\langle\phi|\psi\rangle$ is called the Dirac bracket of the functions $\phi(x)$
and $\psi(x)$. Using this shorthand notation, the normalization integrals in
Equations 5.1 and 5.2 can be written simply as

$$\langle\Psi|\Psi\rangle = 1. \tag{5.4}$$


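The irrelevance of phase factors means that $e^{i\alpha}|\Psi\rangle$ represents the same state
as $|\Psi\rangle$. We can immediately see, for example, that the phase factor has no effect
on the normalization:

$$\langle e^{i\alpha}\Psi|e^{i\alpha}\Psi\rangle = e^{-i\alpha}e^{i\alpha}\langle\Psi|\Psi\rangle = \langle\Psi|\Psi\rangle = 1.$$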


Obviously, much detail is omitted, but this is a strength of Dirac notation, allowing 
us to see the wood for the trees. The concept of a vector space also fits in naturally 
with the principle of superposition: if $|\Psi_1\rangle$ and $|\Psi_2\rangle$ represent two possible states
of a system, then so does any properly-normalized linear combination

$$|\Psi\rangle = a_1|\Psi_1\rangle + a_2|\Psi_2\rangle. \tag{5.5}$$


This principle explains all types of interference in quantum mechanics, including 
the famous two-slit interference of electrons. 


Moving beyond wave mechanics, particles can have spin, and the spin state of a
spin-$\frac{1}{2}$ particle is described not by a wave function but by a two-component
matrix, called a spinor. Each spinor represents a vector in a complex
two-dimensional space (spin space). For example, the spinors for spin-up and
spin-down along the z-axis are written as

$$|\uparrow_z\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad |\downarrow_z\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix},$$

and any spin state $|A\rangle$ of a spin-$\frac{1}{2}$ particle can be written as

$$|A\rangle = a_1|\uparrow_z\rangle + a_2|\downarrow_z\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}. \tag{5.6}$$

This means that vector space ideas, introduced as a shorthand in wave mechanics,
carry over directly into spin space. The main difference is in the definition of the
inner product. For a spin-$\frac{1}{2}$ particle, this is defined as

$$\langle A|B\rangle = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = a_1^* b_1 + a_2^* b_2. \tag{5.7}$$


So bra and ket vectors provide a general notation for quantum mechanics, which 
can be interpreted using either wave functions or spinors, according to context 
(Table 5.1). But, no matter what kind of system is being considered, bra and ket 
vectors always obey the same rules. For example, the inner product always 
satisfies 


$$\langle A|B\rangle = \langle B|A\rangle^*. \tag{5.8}$$
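For spinors, the inner product and its conjugation property can be checked directly. This is a small sketch using NumPy's `np.vdot`, which conjugates its first argument before summing; the two spinors `A` and `B` are arbitrary normalized examples, not states taken from the text.

```python
import numpy as np

def inner(a, b):
    """<A|B> for column spinors: conjugate the first argument, then sum."""
    return np.vdot(a, b)

A = np.array([1.0, 1.0j]) / np.sqrt(2.0)  # arbitrary normalized spinor
B = np.array([3.0, 4.0]) / 5.0            # another arbitrary normalized spinor

ab = inner(A, B)
ba = inner(B, A)
print(ab, ba)       # complex conjugates of one another
print(inner(A, A))  # normalization: equal to 1
```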




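Table 5.1 General notation, wave mechanics notation for a particle in one
dimension, and spinor notation for the spin state of a spin-$\frac{1}{2}$ particle.

| Property | General | Wave mechanics | Spin |
|---|---|---|---|
| Ket vector | $\lvert A\rangle$ | $\Psi_A(x, t)$ | $\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}$ |
| Bra vector | $\langle A\rvert$ | $\Psi_A^*(x, t)$ | $\begin{bmatrix} a_1^* & a_2^* \end{bmatrix}$ |
| Inner product | $\langle A\vert B\rangle$ | $\int_{-\infty}^{\infty} \Psi_A^*(x, t)\,\Psi_B(x, t)\,dx$ | $\begin{bmatrix} a_1^* & a_2^* \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ |
| Normalization | $\langle A\vert A\rangle = 1$ | $\int_{-\infty}^{\infty} \Psi_A^*(x, t)\,\Psi_A(x, t)\,dx = 1$ | $\begin{bmatrix} a_1^* & a_2^* \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = 1$ |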


In general, we need to describe both the spatial variation of a wave function 
and the spin state of a system. In Chapter 4, this was done by writing down a 
total wave function — the product of a spatial wave function and a ket vector 
describing the spin. However, we could also use a single ket vector such as
$|\Psi, S, M_S\rangle$, where $\Psi$ labels the spatial behaviour, and $S$ and $M_S$ label the spin.
So Dirac notation is flexible enough to describe any state in quantum mechanics. 


Identical particles 


Chapter 4 of this book introduced a remarkable feature of the quantum world: 
identical particles. All particles of a given kind (e.g. all electrons) are identical, 
and the effect on the description of states is profound. 


It turns out that all particles in Nature fall into two categories: fermions and 
bosons. Fermions have a spin quantum number s that is equal to an odd number 
times 1/2, while bosons have s equal to an integer (which may be zero). Any 
composite particle built up of an odd number of fermions is a fermion, while any 
particle built up of an even number of fermions is a boson. 


If a system is made up of identical fermions, its total wave function (including 
both space and spin parts) is antisymmetric with respect to interchange of particle 
labels. A consequence of this antisymmetry is the Pauli exclusion principle, 
which prevents more than one fermion from occupying the same quantum state 
and has a decisive effect on the properties and stability of matter. 


If a system is made up of identical bosons, its total wave function is symmetric 
with respect to interchange of particle labels. At low temperatures, this can lead to 
the phenomenon of Bose-Einstein condensation, which results in many atoms 
occupying the same quantum state; all of these atoms can be described by a single 
macroscopic wave function, dependent on a single position vector. 


Exercise 5.2 Is a silver atom a fermion or a boson? The two stable isotopes are
$^{107}$Ag and $^{109}$Ag, and silver has atomic number $Z = 47$. ■
5.2 Describing observables


The second principle in Box 1 tells us that each observable quantity $A$ in quantum
mechanics is represented by a linear operator $\hat{A}$. The word ‘observable’ is used
for any physical quantity that can be measured. Its use highlights the fact that the 
formalism of quantum mechanics includes some things that cannot be measured 
— the phase factor of a wave function, for example. 


The third principle in Box 1 highlights the key role played by operators in 
quantum mechanics: they determine the possible values of measurements. You 
may wonder why we included the phrase ‘As a general rule’ at the beginning of 
Principle 3. The reason is that there are some technical issues associated with the 
continuum which are best avoided in a first reading of the subject. We will finally 
confront these issues in Section 5.5 of this chapter, but can safely ignore them for 
the moment. 


In wave mechanics, operators act on functions to produce new functions. These 
operators typically involve the act of differentiating the function, or multiplying it 
by a constant or by some other function. For example, the operator that represents 
the x-component of momentum is 

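$$\hat{p}_x = -i\hbar\,\frac{\partial}{\partial x}. \tag{5.9}$$

By contrast, the operators that describe spin observables are square matrices. For
a spin-$\frac{1}{2}$ particle, the operator representing the spin component in the direction of
the unit vector $\mathbf{n}$, defined by the spherical coordinates $\theta$ and $\phi$, is

$$\hat{S}_{\mathbf{n}} = \frac{\hbar}{2}\begin{bmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{bmatrix}, \tag{5.10}$$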


and this acts on column spinors through ordinary matrix multiplication to produce 
new column spinors. 


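Although $\hat{S}_{\mathbf{n}}$ looks very different to $\hat{p}_x$, both operators are linear, with the general
feature that

$$\hat{A}\Big(\sum_i a_i\,|\psi_i\rangle\Big) = \sum_i a_i\,\big(\hat{A}|\psi_i\rangle\big) \tag{5.11}$$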


for any constant coefficients a;. 


We can now add a further requirement for the operators that represent observables 
in quantum mechanics: they must be Hermitian as well as linear. This fits in with 
Principle 3, which tells us that the possible values of an observable are given by 
the eigenvalues of the corresponding operator. Measured values are always real 
(rather than complex), and this is guaranteed by the fact that Hermitian operators 
have real eigenvalues. 


In Dirac notation, an operator $\hat{A}$ is said to be Hermitian if

$$\langle\psi_1|\hat{A}\psi_2\rangle = \langle\hat{A}\psi_1|\psi_2\rangle \tag{5.12}$$

for any normalizable functions $\psi_1$ and $\psi_2$. In the context of one-dimensional
wave mechanics, this means that

$$\int_{-\infty}^{\infty} \psi_1^*(x)\,\hat{A}\psi_2(x)\,dx = \int_{-\infty}^{\infty} \big[\hat{A}\psi_1(x)\big]^*\psi_2(x)\,dx, \tag{5.13}$$

for all normalizable functions $\psi_1(x)$ and $\psi_2(x)$. In Chapter 1, we proved that the
momentum operator is indeed Hermitian, using this definition.


In the context of spin, we note that the Mathematical toolkit shows that any 2 × 2
matrix that behaves as a Hermitian operator must be of the form

$$\hat{A} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \quad\text{with}\quad A_{ij} = A_{ji}^*. \tag{5.14}$$

Such matrices are said to be Hermitian. The general spin matrix $\hat{S}_{\mathbf{n}}$ in
Equation 5.10 is Hermitian because its diagonal elements are real and its
off-diagonal elements are complex conjugates of one another.
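The Hermitian property of the spin matrix, and the real eigenvalues that follow from it, can be confirmed numerically. The sketch below assumes units with $\hbar = 1$ and an arbitrarily chosen direction; `spin_matrix` is a hypothetical helper that simply transcribes Equation 5.10.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption)

def spin_matrix(theta, phi):
    """Spin-1/2 matrix for the direction (theta, phi), following Equation 5.10."""
    return 0.5 * hbar * np.array(
        [[np.cos(theta), np.exp(-1j * phi) * np.sin(theta)],
         [np.exp(1j * phi) * np.sin(theta), -np.cos(theta)]]
    )

Sn = spin_matrix(theta=1.1, phi=0.4)  # arbitrary direction

# Hermitian: the matrix equals its own conjugate transpose.
is_hermitian = np.allclose(Sn, Sn.conj().T)

# eigvalsh assumes a Hermitian matrix and returns real eigenvalues in ascending order.
eigenvalues = np.linalg.eigvalsh(Sn)
print(is_hermitian, eigenvalues)
```

Whatever direction is chosen, the eigenvalues come out as $\pm\hbar/2$, in line with the experimentally observed spin components.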


But how do we find the quantum-mechanical operator $\hat{A}$ that describes a given
observable $A$? It is generally a good plan to write down a classical expression for
the observable $A$ in terms of Cartesian position and momentum components, and
then make replacements such as

$$p_x \Rightarrow \hat{p}_x = -i\hbar\,\frac{\partial}{\partial x} \quad\text{and}\quad x \Rightarrow \hat{x} = x.$$

For example, the x-component of orbital angular momentum is represented by the
operator

$$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y = -i\hbar\left(y\,\frac{\partial}{\partial z} - z\,\frac{\partial}{\partial y}\right).$$


We cannot use this procedure for the spin components of spin-$\frac{1}{2}$ particles because
there is no classical starting point in this case: spin is an entirely quantum
property. Instead, we impose constraints based on the experimentally observed
values ($\pm\hbar/2$), the need for spin matrices to be Hermitian, and the assumption
that they obey commutation relations similar to those of the orbital angular
momentum operators $\hat{L}_x$, $\hat{L}_y$ and $\hat{L}_z$. With appropriate conventions, these
assumptions lead to the matrix in Equation 5.10, but of course the ultimate
justification of this matrix is that it leads to results that agree with experiment.


Exercise 5.3 Given that $\hat{y}$, $\hat{z}$, $\hat{p}_y$ and $\hat{p}_z$ are all Hermitian operators, show that
$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y$ is also Hermitian. ■


5.3 Predicting the probabilities of outcomes 


Box 1 contains a very significant omission which goes right to the heart of 
quantum mechanics. It is generally impossible to predict the result of an 
individual measurement, but if we know the state of a system, quantum mechanics 
allows us to predict the probabilities of different outcomes. The principles 
covering this aspect of quantum mechanics were first discussed in Chapter 4 of 
Book 1 and were illustrated again in the context of spin. These developments are 
important enough to be included in our revised list of the principles of quantum 
mechanics. 


To begin with, let us restrict attention to any observable $A$ with a discrete set of
possible values. The corresponding operator $\hat{A}$ has eigenvalue equation

$$\hat{A}|a_i\rangle = a_i|a_i\rangle \quad\text{for } i = 1, 2, 3, \ldots,$$

where $|a_i\rangle$ is the normalized eigenvector corresponding to the eigenvalue $a_i$. We
shall assume that each eigenvalue $a_i$ has only one eigenvector $|a_i\rangle$ (ignoring, as
usual, the physically insignificant choice of overall phase factor).


Any measurement of $A$ gives one or other of the eigenvalues $a_1, a_2, \ldots$, but
what can we say about the probabilities of these outcomes in any given state? 
Expressing the results of Book 1 Chapter 4 in terms of Dirac notation, we have: 


The overlap rule 


For a system in a state described by the state vector $|\Psi\rangle$, the probability that
a measurement of $A$ will yield the result $a_i$ is

$$p_i = |\langle a_i|\Psi\rangle|^2, \tag{5.15}$$

where $|a_i\rangle$ is the eigenvector corresponding to the eigenvalue $a_i$.


In wave mechanics, the inner product in Equation 5.15 is found by integration. 
For example, in a harmonic oscillator, in a state described by the wave function
$\Psi(x, t)$, the probability of getting the energy eigenvalue $E_i$ is

$$p_i = \left|\int_{-\infty}^{\infty} \psi_i^*(x)\,\Psi(x, t)\,dx\right|^2,$$

where $\psi_i(x)$ is the energy eigenfunction corresponding to the eigenvalue $E_i$. For
spin, by contrast, the inner product is found by matrix multiplication. 
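For spin, the overlap rule reduces to a few matrix operations. The sketch below is an illustration under stated assumptions: units with $\hbar = 1$, a particle prepared spin-up along $z$, and a measurement along a direction tilted by 60° from the z-axis in the xz-plane. The helper names are hypothetical; the eigenvectors are obtained from `np.linalg.eigh`, which returns them as columns.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption)

def spin_matrix(theta, phi):
    """Spin-1/2 matrix for the direction (theta, phi), following Equation 5.10."""
    return 0.5 * hbar * np.array(
        [[np.cos(theta), np.exp(-1j * phi) * np.sin(theta)],
         [np.exp(1j * phi) * np.sin(theta), -np.cos(theta)]]
    )

def outcome_probabilities(operator, state):
    """Return (eigenvalues, probabilities) using p_i = |<a_i|Psi>|^2."""
    eigenvalues, eigenvectors = np.linalg.eigh(operator)
    amplitudes = eigenvectors.conj().T @ state  # row i of the matrix is the bra <a_i|
    return eigenvalues, np.abs(amplitudes)**2

theta = np.pi / 3                            # assumed tilt of the measurement axis
state = np.array([1.0, 0.0], dtype=complex)  # spin-up along z
values, probs = outcome_probabilities(spin_matrix(theta, 0.0), state)
print(values)  # eigenvalues in ascending order: -hbar/2, then +hbar/2
print(probs)   # corresponding probabilities, summing to 1
```

For this geometry the probability of the result $+\hbar/2$ is $\cos^2(\theta/2)$, which is 0.75 for the angle chosen here.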


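Exercise 5.4 A spin-$\frac{1}{2}$ particle is in a spin state described by the spinor $|\uparrow_x\rangle$.
If its spin is measured in the y-direction, what is the probability of measuring the
value $+\hbar/2$, for which the eigenvector is $|\uparrow_y\rangle$? You may use the results

$$|\uparrow_x\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad |\uparrow_y\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix}. \quad ■$$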


So far, we have assumed that each allowed value corresponds to a unique 
eigenvector. This is not always the case; for example, a particle in a 
three-dimensional box has degenerate energy levels, with several different 
eigenfunctions corresponding to the same energy. 


Let us suppose that there are $n$ orthonormal eigenvectors corresponding to a
single energy eigenvalue $E_i$. We denote these eigenvectors by $|\psi_{i,1}\rangle$, $|\psi_{i,2}\rangle$,
..., $|\psi_{i,n}\rangle$. Then the probability of getting the energy eigenvalue $E_i$ in a state
described by the state vector $|\Psi\rangle$ is given by

$$p_i = |\langle\psi_{i,1}|\Psi\rangle|^2 + |\langle\psi_{i,2}|\Psi\rangle|^2 + \cdots + |\langle\psi_{i,n}|\Psi\rangle|^2.$$


Notice that we add the probabilities associated with different eigenvectors. You 
might wonder whether this is correct. At the very beginning of the course, we 
mentioned the interference rule which involves adding probability amplitudes 
before taking the square of the modulus. This rule applies when there is more than 
one way of going from a given initial state to a given final state, but it does not 
apply here because the eigenvectors $|\psi_{i,1}\rangle$, $|\psi_{i,2}\rangle$, ..., $|\psi_{i,n}\rangle$ all refer to different
final states, not distinguished by their energy, but nevertheless different from
one another in some other way. For example, electrons in the ground state of a
three-dimensional box can be spin-up or spin-down. 
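A toy calculation makes the distinction concrete. The sketch assumes a three-state system in which the first two basis vectors share one energy eigenvalue; the state vector and the rotated basis are arbitrary illustrative choices.

```python
import numpy as np

# Toy model (assumption): two orthonormal eigenvectors share the same eigenvalue.
degenerate_vectors = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
state = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)

# Correct rule: add the probabilities for the different final states.
p_sum_of_probabilities = sum(abs(np.vdot(v, state))**2 for v in degenerate_vectors)

# Adding amplitudes first (the interference rule) is not appropriate here,
# and in this example it even gives a "probability" greater than 1.
p_amplitudes_first = abs(sum(np.vdot(v, state) for v in degenerate_vectors))**2

# Any other orthonormal basis of the degenerate subspace gives the same answer.
rotated = [(degenerate_vectors[0] + degenerate_vectors[1]) / np.sqrt(2.0),
           (degenerate_vectors[0] - degenerate_vectors[1]) / np.sqrt(2.0)]
p_rotated = sum(abs(np.vdot(v, state))**2 for v in rotated)

print(p_sum_of_probabilities, p_amplitudes_first, p_rotated)
```

The fact that the sum of probabilities does not depend on which orthonormal eigenvectors are used within the degenerate level is a useful consistency check on the rule.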


So far, we have considered observables with discrete possible values. In wave
mechanics, however, observables such as position and momentum have
a continuum of allowed values, and this leads to difficulties with using
Equation 5.15 directly. For example, it is futile to ask what the chances are of
finding a particle at a single point in space; in practice, we measure positions
only to within some finite resolution, so it is much more sensible to ask for
the probability of finding a particle in a small range centred on a given point.
Fortunately, Born’s interpretation of the wave function tells us this probability. In
one dimension, the probability that the particle at time $t$ is in a small interval $\delta x$,
centred on $x$, is

$$\text{probability that position is in small range} = |\Psi(x, t)|^2\,\delta x. \tag{5.16}$$


Similarly, the probability that the momentum of the particle at time $t$ is in a small
interval $\hbar\,\delta k$, centred on $\hbar k$, is

$$\text{probability that momentum is in small range} = |A(k, t)|^2\,\delta k, \tag{5.17}$$

where $A(k, t)$ is the Fourier transform of the wave function:

$$A(k, t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-ikx}\,\Psi(x, t)\,dx. \tag{5.18}$$


These equations can be thought of as extensions of Equation 5.15, adapted for the 
continuum, as outlined in Sections 4.2 and 6.4 of Book 1. 
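Equations 5.16 to 5.18 can be tried out numerically. The following sketch assumes a normalized Gaussian wave packet with mean wave number `k0` and evaluates the Fourier integral as a Riemann sum on a finite grid; the grid sizes and the packet itself are illustrative choices, not taken from the text.

```python
import numpy as np

x = np.linspace(-20.0, 20.0, 2001)
dx = x[1] - x[0]
k = np.linspace(-8.0, 8.0, 801)
dk = k[1] - k[0]

k0 = 2.0  # assumed mean wave number of the packet
psi = np.pi**-0.25 * np.exp(-x**2 / 2.0) * np.exp(1j * k0 * x)

# Equation 5.18 evaluated as a Riemann sum, one row of the matrix per value of k.
A = (np.exp(-1j * np.outer(k, x)) @ psi) * dx / np.sqrt(2.0 * np.pi)

# Both probability distributions should integrate to 1.
total_position = np.sum(np.abs(psi)**2) * dx
total_momentum = np.sum(np.abs(A)**2) * dk

# The most likely momentum is hbar times this wave number.
k_most_likely = k[np.argmax(np.abs(A)**2)]
print(total_position, total_momentum, k_most_likely)
```

The momentum distribution is centred on $\hbar k_0$, as expected for a packet built around the plane wave $e^{ik_0 x}$.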


Predictions made with certainty 


If a system is in a state described by $|a_i\rangle$, an eigenvector of $\hat{A}$ with eigenvalue $a_i$,
the overlap rule tells us that the probability of measuring the value $a_i$ is
$|\langle a_i|a_i\rangle|^2 = 1$. The probability of measuring any other eigenvalue, $a_j \neq a_i$,
in this state is therefore equal to zero. This agrees with the fact that any two
eigenvectors of a Hermitian operator, corresponding to different eigenvalues, are
orthogonal, so that $|\langle a_j|a_i\rangle|^2 = 0$ for $a_j \neq a_i$.

If two operators $\hat{A}$ and $\hat{B}$ commute with one another, so that

$$[\hat{A}, \hat{B}] = \hat{A}\hat{B} - \hat{B}\hat{A} = 0,$$

it is possible to find a set of ket vectors that are simultaneous eigenvectors of both
operators. These eigenvectors describe states in which both $A$ and $B$ have definite
values, and the observables $A$ and $B$ are then said to be compatible with one
another.


An example is given by the operators $\hat{L}^2$ and $\hat{L}_z$, which commute with one
another and have a series of simultaneous eigenvectors labelled by the quantum
numbers $l = 0, 1, 2, \ldots$ and $m = -l, \ldots, 0, \ldots, l$. We have

$$\hat{L}^2\,|l, m\rangle = l(l+1)\hbar^2\,|l, m\rangle,$$
$$\hat{L}_z\,|l, m\rangle = m\hbar\,|l, m\rangle.$$

By contrast, $\hat{L}_x$ and $\hat{L}_y$ do not commute. The corresponding observables $L_x$ and
$L_y$ are not compatible, and it is impossible to find states with definite (non-zero)
values of both these quantities.
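These commutation properties can be checked with explicit matrices. The sketch below assumes the standard 3 × 3 matrix representation of the angular momentum operators for $l = 1$, in the basis $m = 1, 0, -1$ and in units with $\hbar = 1$; this representation is quoted here as an assumption rather than derived in the text.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption)

# l = 1 matrices in the basis m = +1, 0, -1 (standard representation, assumed).
Lz = hbar * np.diag([1.0, 0.0, -1.0]).astype(complex)
Lplus = hbar * np.sqrt(2.0) * np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=complex)
Lminus = Lplus.conj().T
Lx = (Lplus + Lminus) / 2.0
Ly = (Lplus - Lminus) / 2.0j
L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz

def commutator(A, B):
    """Return [A, B] = AB - BA."""
    return A @ B - B @ A

print(np.allclose(commutator(L2, Lz), 0.0))             # compatible observables
print(np.allclose(commutator(Lx, Ly), 1j * hbar * Lz))  # non-zero commutator
print(np.allclose(L2, 2.0 * hbar**2 * np.eye(3)))       # l(l+1) = 2 for l = 1
```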
Expectation values


Once we know the probabilities of all the possible values, we can evaluate 
expectation values and uncertainties, and these can be compared with 
experimentally-measured average values and standard deviations. 


For a discrete set of possible outcomes, the expectation value of an observable A 
is defined by 


$$\langle A\rangle = \sum_i p_i\,a_i, \tag{5.19}$$


where the value a; has probability p;, and the sum runs over all the possible 
outcomes. 


Fortunately, there is an alternative way of calculating expectation values, which is 
usually simpler to use and applies to all observables, whether their values are 
discrete or not. In the state represented by $|\Psi\rangle$, the expectation value of the
observable $A$ is given by

$$\langle A\rangle = \langle\Psi|\hat{A}|\Psi\rangle = \langle\Psi|\hat{A}\Psi\rangle. \tag{5.20}$$


For obvious reasons, this is called the sandwich rule for expectation values. As 
usual, the inner product must be interpreted according to context, using wave 
functions or spinors as appropriate; Table 5.2 shows some examples. 


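Table 5.2 Expectation values of position $x$ and momentum $p_x$
in one-dimensional wave mechanics, and the expectation value
of $S_z$ for a spin-$\frac{1}{2}$ particle in the spin state $|\uparrow_y\rangle$.

| Observable | General | Specific |
|---|---|---|
| $x$ | $\langle\Psi\vert\hat{x}\vert\Psi\rangle$ | $\int_{-\infty}^{\infty} \Psi^*(x, t)\,x\,\Psi(x, t)\,dx$ |
| $p_x$ | $\langle\Psi\vert\hat{p}_x\vert\Psi\rangle$ | $\int_{-\infty}^{\infty} \Psi^*(x, t)\left(-i\hbar\,\frac{\partial}{\partial x}\right)\Psi(x, t)\,dx$ |
| $S_z$ | $\langle\uparrow_y\vert\hat{S}_z\vert\uparrow_y\rangle$ | $\frac{\hbar}{4}\begin{bmatrix} 1 & -i \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ i \end{bmatrix}$ |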
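The agreement between the two routes to an expectation value (Equations 5.19 and 5.20) can be seen in a short calculation. This sketch assumes units with $\hbar = 1$ and a spin state pointing at 60° to the z-axis in the xz-plane, a state chosen for illustration and deliberately different from the one used in Table 5.2.

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption)
Sz = 0.5 * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

theta = np.pi / 3  # assumed polar angle of the spin direction
state = np.array([np.cos(theta / 2), np.sin(theta / 2)], dtype=complex)

# Route 1: sum of probability times value (Equation 5.19).
values, vectors = np.linalg.eigh(Sz)
probs = np.abs(vectors.conj().T @ state)**2
expectation_from_probs = np.sum(probs * values)

# Route 2: the sandwich rule (Equation 5.20).
expectation_sandwich = np.vdot(state, Sz @ state).real

print(expectation_from_probs, expectation_sandwich)
```

Both routes give $(\hbar/2)\cos\theta$, but the sandwich rule does not require the eigenvectors to be found first.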


Exercise 5.5 Evaluate the matrix product in the last row of Table 5.2. What
does your answer tell you about the individual probabilities of getting $S_z = +\hbar/2$
and $S_z = -\hbar/2$ in the state represented by $|\uparrow_y\rangle$? ■


5.4 Time-dependence of states and measurement 


The last two principles in Box 1 describe the time-dependence of quantum states, 
contrasting two very different types of behaviour. 




Schrödinger’s equation


First, let us review the time-dependence predicted by Schrödinger’s equation.
Expressed in bra-ket notation, this states that

$$i\hbar\,\frac{d}{dt}|\Psi\rangle = \hat{H}|\Psi\rangle, \tag{5.21}$$

where $\hat{H}$ is the Hamiltonian operator for the system. An ordinary (rather than
partial) time derivative is used on the left-hand side because the ket vector $|\Psi\rangle$
contains no explicit variables.


In wave mechanics, Schrödinger’s equation is a partial differential equation,
involving a first-order derivative with respect to time, and second-order 
derivative(s) with respect to position coordinate(s). You saw many examples of 
this in Book 1. 


In the context of a spin-$\frac{1}{2}$ particle in a magnetic field $\mathbf{B} = B\mathbf{n}$, the Hamiltonian
operator is a 2 × 2 matrix

$$\hat{H} = -\frac{\gamma_s B\hbar}{2}\begin{bmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{bmatrix}, \tag{5.22}$$

where $\gamma_s$ is the spin gyromagnetic ratio for the particle. In this case, Schrödinger’s
equation is a matrix equation giving the rates of change of the components of the
spinor that describes the spin state of the particle.


For any isolated system, Schrödinger’s equation determines the time-development
of the state vector. Given the state vector at time $t = 0$, we can use Schrödinger’s
equation to predict the state vector at a later time.


A procedure for this was outlined in Book 1 in the context of wave packets. In a
harmonic oscillator, for example, we write the initial wave function as a linear
combination of energy eigenfunctions:

$$\Psi(x, 0) = \sum_{i=0}^{\infty} c_i\,\psi_i(x). \tag{5.23}$$

The coefficients $c_i$ are determined using the fact that the energy eigenfunctions
are orthonormal. Finally, we insert appropriate time-dependent factors into each
term of the sum, to obtain

$$\Psi(x, t) = \sum_{i=0}^{\infty} c_i\,\psi_i(x)\,e^{-iE_i t/\hbar}, \tag{5.24}$$

where $E_i$ is the energy eigenvalue corresponding to the eigenfunction $\psi_i(x)$.
Exactly the same procedure applies to spin, except that the eigenfunctions are now 
eigenvectors, and there are only two of these for any given direction of the 
magnetic field. 
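The three-step procedure (expand in energy eigenvectors, attach the phase factors, recombine) is shown below for the spin case. This is a sketch under stated assumptions: units with $\hbar = 1$, a field along the z-axis, and an arbitrary value `omega` standing for the product $\gamma_s B$; the function `evolve` is a hypothetical helper.

```python
import numpy as np

hbar = 1.0   # units with hbar = 1 (assumption)
omega = 2.0  # assumed value of gamma_s * B

# Field along z: H = -(omega * hbar / 2) * diag(1, -1), a special case of Equation 5.22.
H = -0.5 * omega * hbar * np.array([[1, 0], [0, -1]], dtype=complex)

def evolve(state0, t):
    """Expand in energy eigenvectors, attach phase factors, and recombine."""
    energies, vectors = np.linalg.eigh(H)
    c = vectors.conj().T @ state0  # coefficients c_i = <E_i|Psi(0)>
    return vectors @ (c * np.exp(-1j * energies * t / hbar))

state0 = np.array([1.0, 1.0], dtype=complex) / np.sqrt(2.0)  # spin-up along x
t = 0.8
state_t = evolve(state0, t)

sx = 0.5 * hbar * np.array([[0, 1], [1, 0]], dtype=complex)
norm_t = np.vdot(state_t, state_t).real
sx_expectation = np.vdot(state_t, sx @ state_t).real
print(norm_t, sx_expectation)
```

The state stays normalized, while the expectation value of $S_x$ oscillates as $(\hbar/2)\cos(\omega t)$, which is the familiar precession of a spin about the field direction.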


Schrödinger’s equation is linear. This means that if $|\Psi_1\rangle, |\Psi_2\rangle, \ldots$ are solutions
of Schrödinger’s equation, then so is the linear combination

$$|\Psi\rangle = c_1|\Psi_1\rangle + c_2|\Psi_2\rangle + \cdots,$$


where the c; are constants. This property ensures that the right-hand side of 
Equation 5.24 satisfies Schrödinger’s equation, because it is a linear combination 
of stationary-state wave functions which are themselves solutions of the equation. 
Figure 5.1 John von Neumann, 1903-1957.

Schrödinger’s equation also has the property of preserving the normalization of
the state vector (whether this is a wave function or a spinor). If the state vector is
normalized at $t = 0$, it will remain normalized as the system evolves in time. One
way of establishing this result was given in Exercise 1.12 of Chapter 1, where the
Hermitian character of the Hamiltonian operator was used to show that

$$\frac{d}{dt}\langle\Psi|\Psi\rangle = 0.$$

This property is important because principles such as the overlap rule or Born’s
rule, which assign probabilities to possible experimental outcomes, rely on the
state vector being normalized at the instant of measurement.
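The conservation of normalization can be sketched in a few lines. The outline below is the standard argument, written in the notation of Equations 5.12 and 5.21; it is offered as a reminder of how the Hermitian property enters, not as a reproduction of the solution to Exercise 1.12.

```latex
\begin{aligned}
\frac{d}{dt}\langle\Psi|\Psi\rangle
  &= \Big\langle \frac{d\Psi}{dt} \Big| \Psi \Big\rangle
   + \Big\langle \Psi \Big| \frac{d\Psi}{dt} \Big\rangle \\
  &= \Big\langle \tfrac{1}{i\hbar}\hat{H}\Psi \Big| \Psi \Big\rangle
   + \Big\langle \Psi \Big| \tfrac{1}{i\hbar}\hat{H}\Psi \Big\rangle
   && \text{(Schr\"odinger's equation)} \\
  &= \frac{i}{\hbar}\,\langle \hat{H}\Psi | \Psi \rangle
   - \frac{i}{\hbar}\,\langle \Psi | \hat{H}\Psi \rangle
   && \text{(constants leave the bra as complex conjugates)} \\
  &= 0
   && \text{(since } \langle \hat{H}\Psi|\Psi\rangle = \langle\Psi|\hat{H}\Psi\rangle \text{ for Hermitian } \hat{H}\text{)}.
\end{aligned}
```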


Measurement and collapse of the state vector 


So long as Schrödinger’s equation applies, the time-development of the wave
function is completely deterministic: given $\Psi(x, 0)$, we can predict what $\Psi(x, t)$
will be. However, we know that quantum physics as a whole is indeterministic: in
most cases, it is impossible to predict the result of a measurement carried out on a
system. We can predict the probabilities of various possible outcomes, but we
cannot say which outcome will occur on any given occasion. How can we
reconcile the determinism of Schrödinger’s equation with the indeterminism of
quantum measurements? The answer is that we cannot; the act of measurement
causes the wave function to change uncontrollably and unpredictably, in a way
that is not governed by Schrödinger’s equation.


The word ‘measurement’ never appears in a list of principles of classical physics; 
it appears in laboratory procedures but not in the fundamentals of the theory. In 
quantum mechanics, however, the concept of measurement plays a key role in the 
theory itself. In classical physics, measurement is thought of as revealing some 
pre-existing property of a system, but quantum objects do not have properties until 
they have been measured, and, furthermore, the values returned by a quantum 
measurement are those of the system after the measurement, not before. 


You can think of a quantum measurement as an interaction or communication of 
information between a quantum system and a measuring device which is treated 
classically. If we use a meter with a pointer, for example, we ignore the fact that 
the uncertainty principle implies some uncertainty in the position of the pointer. 
The measuring device is supposed to be sufficiently large for its own quantum 
fluctuations to be neglected. We will treat the measuring device as a sort of black 
box, and not ask too closely what happens inside it. We can say, however, that a 
measurement occurs when a quantum system causes some sort of irreversible 
change in the measuring device, and possibly in its surroundings. For example, 
a Geiger counter may click, causing a sound wave to travel outwards in the 
surrounding air, which heats up very slightly. This process cannot be undone; it is 
irreversible. 


As a result of the measurement, two things happen. First, we get an experimental 
result — a reading on a meter or a click in a counter. Secondly, the state of the 
system changes abruptly and drastically. This process is called the collapse of the 
wave function or, more generally, the collapse of the state vector. It seems to 
have been first introduced in lectures given by Heisenberg in 1929, but is also 
associated with Dirac and, especially, von Neumann (Figure 5.1), who gave the 
first rigorous mathematical treatment of quantum mechanics. 


5.4 Time-dependence of states and measurement 


Consider an observable $A$, represented by the operator $\hat{A}$ with a discrete set of 
eigenvalues $a_i$ and eigenvectors $|a_i\rangle$. Suppose that we measure $A$ in a state 
represented by the linear combination 

$$|\Psi\rangle = c_1 |a_1\rangle + c_2 |a_2\rangle + \cdots. \qquad (5.25)$$

We cannot predict the value of $A$ that will be obtained in such a measurement, but 
we can say that the probability of getting the value $a_i$ is $|c_i|^2$. David Bohm has 
described this situation by saying that a quantum state has within it a set of 
potentialities (things which might come into being), and that the act of 
measurement actualizes one of these. 


But what does the measurement do to the state of the system? The general rule is 
as follows: 


The state of the system immediately after the measurement is represented by 
the normalized eigenvector $|a_i\rangle$ that corresponds to the eigenvalue $a_i$ that 
was obtained in the measurement. 


If we happened to get the value $a_2$, for example, then the state of the system 
immediately after the measurement would be $|a_2\rangle$. The transition 

$$|\Psi\rangle = c_1 |a_1\rangle + c_2 |a_2\rangle + \cdots \;\longrightarrow\; |a_2\rangle \qquad (5.26)$$

is what constitutes the collapse of the state vector. 


The collapse can be pictured as a combination of the two processes shown in 
Figure 5.2: a projection of the state vector $|\Psi\rangle$ onto the direction of one of 
the eigenvectors $|a_i\rangle$ of $\hat{A}$, and a re-scaling to produce the normalized 
eigenvector $|a_i\rangle$. Knowledge of the initial state $|\Psi\rangle$ has been lost once the 
measurement has been made and the collapse onto $|a_i\rangle$ has taken place. 

The collapse is not described by Schrödinger’s equation. Not unnaturally, 
some physicists are concerned about this and are actively seeking a clearer 
understanding of the measurement process. 
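The projection-and-rescaling picture of collapse can be illustrated with a short simulation. This is only a sketch: the function name `measure`, the example amplitudes and the seeded random generator are illustrative assumptions, and the state is assumed to be given by its coefficients $c_i$ in the eigenbasis of $\hat{A}$.

```python
import math
import random


def measure(coeffs, rng):
    """Simulate one measurement of A on |Psi> = sum_i c_i |a_i>.

    coeffs: complex amplitudes c_i in the eigenbasis of A (assumed normalized).
    Returns (index i of the outcome a_i, coefficients of the collapsed state).
    """
    probs = [abs(c) ** 2 for c in coeffs]
    # Choose outcome i with probability |c_i|^2.
    i = rng.choices(range(len(coeffs)), weights=probs)[0]
    # Projection: keep only the component along |a_i>.
    projected = [c if k == i else 0j for k, c in enumerate(coeffs)]
    # Re-scaling: divide by the norm so that the result is normalized.
    # The surviving coefficient keeps its phase, which has no physical effect.
    norm = math.sqrt(sum(abs(c) ** 2 for c in projected))
    collapsed = [c / norm for c in projected]
    return i, collapsed


rng = random.Random(0)            # seeded only to make the sketch repeatable
psi = [0.6 + 0j, 0.8j]            # c_1 = 0.6, c_2 = 0.8i (hypothetical example)
outcome, after = measure(psi, rng)
repeat_outcome, _ = measure(after, rng)   # an immediate second measurement
```

Whichever outcome is drawn first, the collapsed state has a single non-zero coefficient of unit modulus, so the immediate second measurement returns the same outcome.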


[Figure 5.2 diagram: three panels labelled collapse, projection and rescaling.] 


Figure 5.2 The collapse of a state vector can be pictured as the combination of 
a projection and a re-scaling. 


So far, we have considered an observable with a discrete set of possible values. 
As usual, modifications are needed for observables, such as position, with a 
continuous range of possible values. Modern instruments can measure quite 
accurately the position of a particle emerging from a nuclear experiment, but the 
recorded position x, y, z, say, is always within some finite region dx dy dz. 




Chapter 5 The principles of quantum mechanics: a review 


Figure 5.3 The position of a particle is determined to be between 
$x_0 - \Delta x$ and $x_0 + \Delta x$. The real part of the wave function is shown 
(a) before the measurement and (b) after the measurement. 
Figure 5.3 illustrates how the wave function of a particle in one dimension 
collapses when a position measurement is made. Figure 5.3a shows the real part 
of the wave function prior to the measurement. Let us suppose that the position of 
the particle is measured to be between $x_0 - \Delta x$ and $x_0 + \Delta x$; then the wave 
function immediately after the measurement is of the form shown in Figure 5.3b. 
This is equal to zero outside the region where the particle has been located, and is 
similar in shape to the original wave function inside that region, but is normalized 
to unity. 


[Figure 5.3 panels (a) and (b): each shows the real part of $\Psi$.] 


Successive measurements 


If we measure $A$ once and get the value $a_2$, and then measure $A$ again immediately 
afterwards on the same system, we are bound to get the same value again. This is 
because the first measurement has caused the state of the system to collapse to 
$|a_2\rangle$, and we are certain to get the value $a_2$ in this state. Of course, this makes 
good sense, since the second measurement simply corroborates the first. But what 
happens if we wait a while between the two measurements? There are two cases 
to consider. 


1. If $A$ is a conserved quantity, the operator $\hat{A}$ will commute with the 
Hamiltonian $\hat{H}$, and these two operators will share a common set of 
eigenfunctions. Now, if the eigenvector $|a_2\rangle$ is also an eigenvector of $\hat{H}$, it will 
evolve as a stationary state and will therefore remain unchanged apart from an 
unimportant phase factor $e^{-iE_2 t/\hbar}$. It will therefore remain an eigenvector of $\hat{A}$ 
with eigenvalue $a_2$, and we can afford to wait a considerable time before making 
the second measurement; provided that the system is isolated, the value $a_2$ will be 
certain, even after a considerable delay. 


2. More generally, however, if $|a_2\rangle$ is not an eigenvector of $\hat{H}$, it will evolve as a 
linear combination of stationary states, and this implies significant changes. After 
a short time, the state of the system will no longer be described by $|a_2\rangle$, but will 
be some linear combination of eigenvectors of $\hat{A}$. The second measurement of $A$ 
may then give a value different from $a_2$, and it may even give a value that would 
not have been possible in the original state, before any measurements were made. 
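The two cases can be compared in a minimal two-level sketch. Everything specific here is an assumption made for illustration: the observable is taken to be $\hat{A} = \hat{\sigma}_z$ acting on a two-component state, units are chosen so that $\hbar = 1$, and the two hypothetical Hamiltonians are $(\omega/2)\hat{\sigma}_z$ (which commutes with $\hat{A}$) and $(\omega/2)\hat{\sigma}_x$ (which does not).

```python
import cmath
import math


def evolve_commuting(state, omega, t):
    """Evolve under H = (omega/2) sigma_z (hbar = 1), which commutes with A = sigma_z."""
    up, down = state
    return [cmath.exp(-0.5j * omega * t) * up, cmath.exp(0.5j * omega * t) * down]


def evolve_noncommuting(state, omega, t):
    """Evolve under H = (omega/2) sigma_x, which does not commute with A = sigma_z.

    Uses exp(-i H t) = cos(omega t / 2) I - i sin(omega t / 2) sigma_x.
    """
    up, down = state
    c, s = math.cos(omega * t / 2), math.sin(omega * t / 2)
    return [c * up - 1j * s * down, -1j * s * up + c * down]


def prob_same_value(state):
    """Probability of finding the 'up' eigenvalue of A again."""
    return abs(state[0]) ** 2


collapsed = [1 + 0j, 0j]          # state just after the first measurement
omega, t = 1.0, math.pi / 2       # hypothetical frequency and waiting time
p_case1 = prob_same_value(evolve_commuting(collapsed, omega, t))
p_case2 = prob_same_value(evolve_noncommuting(collapsed, omega, t))
```

In case 1 the state only acquires a phase factor, so the first result is certain to be repeated. In case 2 the probability of repeating it is $\cos^2(\omega t/2)$, which has fallen to one half for the waiting time chosen here.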


One consequence of the collapse of the state vector is that we end up with states 
that are quite different from those we started out with. This means that we 
cannot readily check the predictions of quantum mechanics by taking successive 
measurements on a single system. To confirm that a quantum-mechanical 
prediction for an expectation value is accurate, we really need to prepare a large 
number of systems in identical states, and take measurements on each of them. 
The average value of all these measurements should approach the predicted 
expectation value as the number of measurements becomes very large. 
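The need for many identically prepared systems can be illustrated numerically. The sketch below assumes a hypothetical two-outcome observable with eigenvalues $+1$ and $-1$ and a state with amplitudes $0.6$ and $0.8$; the function names and the seed are also illustrative choices.

```python
import random


def expectation(coeffs, eigenvalues):
    """Predicted expectation value: sum_i |c_i|^2 a_i."""
    return sum(abs(c) ** 2 * a for c, a in zip(coeffs, eigenvalues))


def ensemble_average(coeffs, eigenvalues, n_systems, rng):
    """Average of single measurements on n_systems identically prepared systems."""
    probs = [abs(c) ** 2 for c in coeffs]
    # Each system is measured once; outcome a_i occurs with probability |c_i|^2.
    results = rng.choices(eigenvalues, weights=probs, k=n_systems)
    return sum(results) / n_systems


rng = random.Random(1)                 # seeded only for repeatability
coeffs = [0.6, 0.8]                    # hypothetical amplitudes c_1, c_2
eigenvalues = [1.0, -1.0]              # hypothetical eigenvalues a_1, a_2
predicted = expectation(coeffs, eigenvalues)
small_sample = ensemble_average(coeffs, eigenvalues, 10, rng)
large_sample = ensemble_average(coeffs, eigenvalues, 100_000, rng)
```

The predicted value is $0.36 - 0.64 = -0.28$. An average over ten systems can differ from it noticeably, whereas an average over a hundred thousand systems should agree to within a few parts in a thousand.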


Exercise 5.6 A particle is in the ground state of a simple harmonic well with 
energy $E_0$. Its position is measured to be $x$ to within a small resolution. The 
energy of the particle is then measured, and the particle is found to have an 
energy $E_{10}$, corresponding to the tenth excited state of the oscillator. The energy 
is then measured again after some delay, and the particle is still found to have 
energy $E_{10}$. Explain these results. 


Superposition matters 


When a photon is incident on a half-silvered mirror (Figure 5.4), we might 
suppose that interaction with the mirror would cause the state of the photon to 
collapse, either into a state that passes through the mirror, or into a state that is 
reflected by it. This, however, can be shown to be false. 


Figure 5.4 A photon incident on a half-silvered mirror. 


Figure 5.5 A Mach-Zehnder interferometer. 


Figure 5.5 shows a more complicated device called a Mach-Zehnder 
interferometer; this consists of two half-silvered mirrors, H1 and H2, two ordinary 
mirrors, M1 and M2, and two detectors, Da and Db. You met this device in 
Chapter 1 of Book 1. 


The important point to note is that a suitable choice of path lengths can ensure that 
a photon arrives at detector Da with certainty. This cannot be understood if the 
photon travels along just one of the paths P1 or P2, because the half-silvered 
mirror H2 would then give only a 50% chance of arrival at detector Da. The 
photon must therefore emerge from the half-silvered mirror H1 in a superposition 
of states, and these states interfere with one another in such a way as to 
completely suppress any chance of detection by Db. This example shows that we 
must be very careful when claiming that a measurement, and the accompanying 
state vector collapse, has taken place. In this case, the collapse occurs when one 
of the detectors counts a photon, and not before. 
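The interference argument can be checked with a small amplitude calculation. This is a sketch under stated assumptions: each half-silvered mirror is modelled as a symmetric 50:50 beam splitter (amplitude $1/\sqrt{2}$ for transmission and $i/\sqrt{2}$ for reflection), the ordinary mirrors are taken to add the same phase to both paths and are therefore omitted, and the output port that receives the photon for equal path lengths is simply labelled Da.

```python
import cmath
import math


def beam_splitter(amps):
    """Symmetric 50:50 beam splitter acting on the two path amplitudes."""
    a, b = amps
    r = 1 / math.sqrt(2)
    return [r * (a + 1j * b), r * (1j * a + b)]


def mach_zehnder(phase_difference):
    """Detection probabilities [P(Db), P(Da)] for one photon entering at H1.

    phase_difference: extra phase picked up on path P2 relative to path P1.
    """
    amps = beam_splitter([1 + 0j, 0j])            # H1 creates a superposition
    amps = [amps[0], amps[1] * cmath.exp(1j * phase_difference)]
    out = beam_splitter(amps)                     # H2 recombines the paths
    return [abs(x) ** 2 for x in out]


p_db, p_da = mach_zehnder(0.0)                    # equal path lengths
# If the photon really travelled along just one path, H2 alone would act on it:
one_path = [abs(x) ** 2 for x in beam_splitter([1 + 0j, 0j])]
```

With equal path lengths the two contributions to Db cancel and the photon reaches Da with certainty, whereas a photon confined to a single path would give each detector a probability of one half. Changing `phase_difference` to $\pi$ swaps the roles of the two detectors.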


A second example of superposition is provided by a beam of spin-$\frac{1}{2}$ particles all in 
the same spin state 

$$|A\rangle = \frac{1}{\sqrt{2}}\big(|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\big). \qquad (5.27)$$

If these particles enter a Stern-Gerlach analyzer oriented in the $z$-direction, half 
of them will be measured to be spin-up and the other half spin-down. You 
might suppose that this incident beam would be indistinguishable from a beam 
containing a random mixture of spin-$\frac{1}{2}$ particles, half of which are prepared to be 
in the state $|{\uparrow_z}\rangle$, and the other half in the state $|{\downarrow_z}\rangle$. If so, think again! 


The state $|A\rangle$ in Equation 5.27 is actually equal to $|{\uparrow_x}\rangle$, a state that is certain to 
give $+\hbar/2$ when spin is measured in the $x$-direction. So, if a beam of particles in 
this state enters a Stern-Gerlach analyzer oriented in the $x$-direction, we will find 
that every particle is measured to be spin-up. The same cannot be said for a beam 
containing a random mixture of particles in $|{\uparrow_z}\rangle$ and $|{\downarrow_z}\rangle$ states; in this case, half 
the particles will be measured to be spin-up in the $x$-direction, and the other half 
will be measured to be spin-down. 
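The difference between the superposition and the random mixture can be made concrete with the overlap rule. In this sketch, spinors are written as two-component lists in the $z$-basis, and the helper names `prob` and `mixture_prob` are illustrative assumptions.

```python
import math

R = 1 / math.sqrt(2)
UP_Z = [1 + 0j, 0j]
DOWN_Z = [0j, 1 + 0j]
UP_X = [R + 0j, R + 0j]     # (|up_z> + |down_z>)/sqrt(2), the state of Equation 5.27


def prob(outcome_state, state):
    """Overlap rule: |<outcome|state>|^2 for two-component spinors."""
    amp = sum(o.conjugate() * s for o, s in zip(outcome_state, state))
    return abs(amp) ** 2


def mixture_prob(outcome_state, weighted_states):
    """Classical average over a random mixture [(weight, state), ...]."""
    return sum(w * prob(outcome_state, s) for w, s in weighted_states)


mixture = [(0.5, UP_Z), (0.5, DOWN_Z)]

pure_z = prob(UP_Z, UP_X)               # z-analyzer, superposition beam
mixed_z = mixture_prob(UP_Z, mixture)   # z-analyzer, mixed beam
pure_x = prob(UP_X, UP_X)               # x-analyzer, superposition beam
mixed_x = mixture_prob(UP_X, mixture)   # x-analyzer, mixed beam
```

Both beams give spin-up with probability one half in a $z$-analyzer, so that measurement cannot tell them apart. In an $x$-analyzer the superposition gives spin-up with certainty while the mixture still gives one half.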


We can now briefly discuss the Schrödinger cat paradox, concocted by 
Schrödinger in 1935. Schrödinger asked us to imagine a cat that is enclosed in a 
box, along with a tamper-proof ‘diabolical device’ which will release a lethal 
poison when a Geiger counter clicks. Also inside the box is a tiny amount of a 
radioactive substance, whose decay would trigger the Geiger counter, leading to 
the demise of the cat. Suppose that the amount of radioactive substance, and its 
half-life, are such that there is a fair chance that the poison would be released in a 
matter of an hour or so. Now, suppose that we close the lid of the box and wait. 
How should we describe the state of the cat after one hour? Schrödinger pointed 
out that, prior to any measurement, quantum mechanics describes the cat as being 
in a linear superposition of two states — one in which it is dead and another in 
which it is alive. Only on opening the box and ‘making a measurement’ do we 
cause the cat’s state vector to collapse onto either a living cat or a dead cat! 


Just as there is a difference between the two types of Stern-Gerlach beam discussed 
above, so there is a difference between the quantum-mechanical description (a 
linear superposition of an alive and a dead cat) and the more common-sense 
description in which the cat is either dead or alive, but we do not know which until 
we open the box. Naturally enough, Schrödinger found the quantum description 
unpalatable. 


Attempts to resolve this ‘paradox’ have been very fruitful, leading to a deeper 
understanding of the role of the environment in disturbing superposition states in 
macroscopic systems, and in suggesting experiments that explore the possibility 
of a boundary between the classical world and the quantum world. But, so far, 
there is no universally-agreed resolution of the paradox. 


5.5 Dealing with the continuum 


In the process of stating the basic principles of quantum mechanics, we have 
run into a number of problems concerning observables with a continuum of 
possible values. These problems can be resolved at the expense of considerable 
mathematical complexity, beyond the level of this course. We can, however, 
sketch the main ideas. 


The possible values of an observable 


Box 1 stated that, ‘as a general rule’, the only possible outcomes of a 
measurement of an observable are the eigenvalues of the associated operator. The 
‘general rule’ covers all observables with discrete values (e.g. $L^2$ and $L_z$), but it 
does not cover observables with a continuous set of possible values. In particular, 
there are difficulties with both position and momentum. 


The difficulty with position is the most severe, since the position operator $\hat{x}$ has no 
legitimate eigenvalues or eigenfunctions. This is because it is impossible to find a 
fixed constant $\lambda$ and a function $f(x)$ such that 

$$\hat{x} f(x) = x f(x) = \lambda f(x) \quad \text{for all } x.$$


The difficulty with momentum is slightly more technical. In this case, the 
eigenvalue equation 

$$\hat{p}_x f(x) = -i\hbar \frac{\partial f}{\partial x} = \lambda f(x)$$

can be solved: the eigenfunctions can be expressed as $f(x) = e^{ikx}$, and the 
corresponding eigenvalues are $\hbar k$. However, these eigenfunctions do not describe 
realistic states because they cannot be normalized. Moreover, a sharp value, $\hbar k$, of 
the momentum, implies an infinite uncertainty in position, via the uncertainty 
principle. 


You might wonder whether this really matters. The answer is that it is not very 
important in non-rigorous treatments, but that rigorous treatments of quantum 
mechanics confine themselves to functions that can be normalized. One 
illustration of the need for this is the proof given in Chapter 1 that the momentum 
operator is Hermitian; this proof hinged on the use of normalizable functions. So 
if we really want to be rigorous, the momentum eigenfunctions are unsatisfactory 
too. 


In the absence of suitable eigenfunctions, how can we find the allowed values of a 
given observable? For the sake of completeness we will briefly sketch one way of 
doing this, although the details will not be assessed. A clue is provided by 
rewriting the eigenvalue equation $\hat{A}\,|f\rangle = \lambda\,|f\rangle$ in the form 

$$(\hat{A} - \lambda)\,|f\rangle = |0\rangle, \qquad (5.28)$$

where $|0\rangle$ is the zero vector, a vector whose norm, $\sqrt{\langle 0|0\rangle}$, is equal to zero. 
The normalized vectors $|f\rangle$ that satisfy this equation are the eigenvectors of $\hat{A}$, 
and the corresponding constants are the eigenvalues. 


Our problem is that position, momentum, and other observables with a continuum 
of allowed values, do not have appropriate normalized eigenvectors satisfying 
Equation 5.28. The best we can do in such a case is to write down an equation like 

$$(\hat{A} - \lambda)\,|f\rangle = |\eta\rangle, \qquad (5.29)$$

where $\lambda$ is a constant, $|f\rangle$ is a normalized vector, and $|\eta\rangle$, although small, is not 
quite equal to the zero vector. However, by adjusting the choice of the normalized 
vector $|f\rangle$, it may be possible to make the norm of $|\eta\rangle$ as small as we like, 
getting infinitesimally close to zero. If this is possible for a fixed $\lambda$, we shall say 
that $\lambda$ is a generalized eigenvalue of $\hat{A}$, and interpret it as being an allowed value 
of the observable $A$. 

Any eigenvalue is also a generalized eigenvalue since zero is infinitesimally close 
to itself. 
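The idea of a generalized eigenvalue can be illustrated numerically for the position operator. The sketch below takes $|f\rangle$ to be a normalized top-hat function of adjustable width centred on $x_0$ and evaluates the norm of $(\hat{x} - \lambda)|f\rangle$ with a simple midpoint rule; the function name `residual_norm` and the sample values are illustrative assumptions.

```python
import math


def residual_norm(x0, width, lam, n=20000):
    """Norm of (x - lam) f(x), where f is a normalized top-hat of the given
    width centred on x0, so that |f(x)|^2 = 1/width inside the top-hat."""
    density = 1.0 / width
    dx = width / n
    total = 0.0
    for k in range(n):
        x = x0 - width / 2 + (k + 0.5) * dx    # midpoint of the k-th strip
        total += (x - lam) ** 2 * density * dx
    return math.sqrt(total)


lam = 2.0                                      # candidate allowed value of position
widths = [1.0, 0.1, 0.01]
norms = [residual_norm(lam, w, lam) for w in widths]
```

For a top-hat centred on $\lambda$ the integral can be done by hand and gives a norm of $w/(2\sqrt{3})$ for width $w$, so the norm can be made as small as we like by narrowing the top-hat. The same construction works for any real $\lambda$, which is why every real number is an allowed value of position.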


Probabilities of experimental outcomes 


If an observable has discrete values, we can find the probability of any given value 
by using the overlap rule (Equation 5.15). However, this rule is inappropriate for 
observables with continuous values. From a practical point of view, experiments 
never resolve single values from a continuum — there is always some finite 
resolution. Moreover, the overlap rule is phrased in terms of normalized 
eigenvectors, which do not exist in the continuum. 


However, we can replace the normalized eigenvectors with the vectors $|f\rangle$ that 
approximate them in the sense of Equation 5.29, with $\sqrt{\langle\eta|\eta\rangle}$ tending to zero. For 
a position measurement centred on $x_0$, for example, we can choose $|f\rangle$ to be an 
extremely narrow normalized top-hat function centred on $x_0$. This is precisely 
what was done in Section 4.2 of Book 1, and you saw that it led to Born’s rule for 
position. A similar procedure leads to Born’s rule for momentum, although we 
shall not go through the details. 


Collapse of the state vector 


We discussed this issue in the previous section, in the context of position 
measurements. We shall not attempt to generalize beyond this; in practice, the 
details of any collapse depend on details of the measuring device. We can say in 
general that a fine-resolution measurement of an observable with a continuum of 
possible values results in a narrow wave packet centred on the value obtained in 
the measurement, with a width determined by the resolution of the measurement. 


5.6 The principles of quantum mechanics 


Finally, as a summary, we can now give a more comprehensive list of the 
principles of quantum mechanics. We make no attempt to give a minimal list of 
axioms or postulates, from which the whole subject follows. Our aim is merely to 
collect together key principles which lie at the heart of the subject and inform 
many different aspects of it. It is convenient to depart slightly from the order of 
the principles listed at the outset of this chapter. 


States 


1a The state of a system is specified by a normalized state vector $|\Psi\rangle$. 


1b The vector $e^{i\alpha}\,|\Psi\rangle$, where $\alpha$ is a real number, represents the same physical 
state as $|\Psi\rangle$. 


1c If $|\Psi_1\rangle$ and $|\Psi_2\rangle$ represent possible states of a system, and $c_1$ and $c_2$ are 
complex constants, the normalized linear combination $c_1 |\Psi_1\rangle + c_2 |\Psi_2\rangle$ 
also represents a possible state of the system. 




States of identical particles 


2a All particles in Nature fall into two categories: fermions and bosons. For 
fermions, the spin quantum number s is equal to an odd multiple of 1/2; for 
bosons, s is equal to an integer (including zero). 


2b Composite particles with an odd number of fermions are fermions; 
composite particles with an even number of fermions are bosons. 


2c The total wave function of a collection of identical fermions is 
antisymmetric under exchange of particle labels; this leads to the Pauli 
exclusion principle. The total wave function of a collection of identical 
bosons is symmetric under exchange of particle labels; this leads to 
Bose-Einstein condensation at low temperatures. 


Observables 


3 Observables are represented by linear Hermitian operators. 


Measurements and their results 


The next three principles apply to observables with discrete sets of values: 


4a The possible measured values of an observable $A$ are the eigenvalues of the 
corresponding quantum-mechanical operator, $\hat{A}$. 


5a For a system in a state represented by the normalized state vector $|\Psi\rangle$, the 
probability that a measurement of $A$ will yield the result $a_i$ is 
$p_i = |\langle a_i|\Psi\rangle|^2$, where $|a_i\rangle$ is the normalized eigenvector corresponding to 
the eigenvalue $a_i$. This is the overlap rule. 


6a The state of the system immediately after a measurement is represented by 
the normalized eigenvector $|a_i\rangle$ that corresponds to the eigenvalue $a_i$ that 
was obtained in the measurement. This collapse of the state vector, leading 
from the state on which the measurement is made to $|a_i\rangle$, cannot be 
described by Schrödinger’s equation and is accompanied by an irreversible 
change in the measuring device. 


These principles can be extended to observables with continuous ranges of 
values: 


4b The possible measured values of any observable $A$ are the set of numbers 
that are generalized eigenvalues of the corresponding quantum-mechanical 
operator, $\hat{A}$. 


5b The overlap rule can be generalized to an observable with a continuous set 
of values, leading to Born’s rules for position and momentum. 


6b A fine-resolution measurement of an observable with a continuum of 
possible values causes the state vector to collapse onto a narrow 
wave packet centred on the value obtained in the measurement, with 
a width determined by the resolution of the measurement. 




Time-development in the absence of measurement 


7 Provided that a system does not interact with a measuring device, its 
time-development is governed by Schrödinger’s equation 

$$i\hbar \frac{\mathrm{d}}{\mathrm{d}t}|\Psi\rangle = \hat{H}\,|\Psi\rangle,$$

where $|\Psi\rangle$ represents the state of the system at time $t$, and $\hat{H}$ is the 
Hamiltonian operator of the system. 


Exercise 5.7 Referring to Principle 1c, if $|\Psi_1\rangle$ and $|\Psi_2\rangle$ are orthonormal, how 
must $c_1$ and $c_2$ be related? 


Exercise 5.8 (a) Consider a system in which the observable $B$ has a discrete 
set of possible eigenvalues $b_i$, each corresponding to a single eigenvector $|b_i\rangle$. Use 
Principle 5a to find the probability that a measurement of $B$ in a state represented 
by $|b_i\rangle$ will yield the eigenvalue $b_j$. 


(b) Write down a formula for the probability that a measurement of an 
observable $A$ in the state $|b_i\rangle$ will give a particular eigenvalue, $a_j$, of $\hat{A}$. Which 
named rule of Chapter 3 was an exemplar of this situation? 


Achievements from Chapter 5 


After studying this chapter, you should be able to: 


5.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


5.2 Make appropriate use of the general bra-ket notation, and interpret it in the 
contexts of wave mechanics and spin space. 


5.3 Outline the basic principles of quantum mechanics, referring appropriately 
to states, identical particles, observables, measurements and 
time-development, and using language and notation appropriate for both 
wave mechanics and spin. 


5.4 Give an account of measurement in quantum mechanics. The account could 
include the probabilities of outcomes, the collapse of the state vector, the 
role of the measuring device and the need to compare quantum-mechanical 
predictions with repeated measurements carried out on identically-prepared 
states. 


5.5 Describe, in general terms, the difficulties presented by the continuum in 
quantum mechanics, and describe some ways in which these difficulties are 
overcome. 


Chapter 6 Quantum entanglement 
and the EPR argument 


Introduction: stranger and stranger 


‘I would call it [entanglement] not one but rather the characteristic trait of 
quantum mechanics, the one that enforces its entire departure from classical 
lines of thought.’ 


Erwin Schrödinger, 1935 


Those words are from a paper by Schrödinger written partly in response to a paper 
by Einstein (Figure 6.1), Podolsky and Rosen, that had been published shortly 
before. For many years, neither of these papers had much impact, as physicists 
around the world busily applied quantum mechanics, with enormous success, to 
understanding the structure of matter. But those papers were a time bomb ticking 
away in the foundations of physics. A handful of papers by Bohm in the 1950s, by 
Bell and others in the 1960s, and more recently a great stream of papers (see 
Figure 6.2), all look back to those once almost forgotten papers of the 1930s. 
From this work has sprung a new field of ‘quantum information’, which includes 
such topics as quantum cryptography, quantum teleportation and quantum 
computing. 


Figure 6.1 Albert Einstein (1879-1955), Nobel prize for physics in 1921. 


Figure 6.2 The number of times that the paper by Einstein, Podolsky and 
Rosen was cited in other refereed journal articles in each year from its 
publication in 1935 to 2005. 


These developments have also taken us deeper into understanding the meaning of 
quantum mechanics. For example, when we say that a particle does not have a 
position until its position is measured, how do we justify such a statement? The 
concept of entanglement introduced by Schrödinger is one key to addressing such 
issues. In this chapter we shall introduce the concept of entanglement, and explain 
how it has deeply affected our understanding of the world; the next chapter will 
describe some practical applications of entanglement. 


Entanglement necessarily involves states of two or more particles (the particles 
that are ‘entangled’), so this chapter will make essential use of results from 
Chapter 4 which explained how to describe spin states for more than one particle. 


The first section of this chapter briefly discusses the fundamental question of the 
existence (or non-existence) of hidden variables. Section 6.2 explains what 
entanglement is, with some examples of entangled states. Section 6.3 explains the 
basic principles behind experiments that might test for entanglement. These ‘in 
principle’ experiments are based upon the properties of the singlet state of two 
spin-$\frac{1}{2}$ particles. Most actual experiments that have been done involve photons, 
and two key experiments are briefly described in Section 6.4. Section 6.5 very 
briefly discusses the general significance of entanglement, looking forward to 
Chapter 7 where technological applications are described. 


6.1 Do hidden variables exist? 


Does a particle have a position or momentum before these quantities are 
measured? In standard quantum mechanics, the answer is ‘no’, but what are the 
grounds for this assertion? It seems very natural to think that the probabilistic 
nature of quantum mechanics is, after all, rather like that of classical statistical 
physics. In classical physics, particles do have positions and momenta and so, 
given a sufficiently powerful computer, we could calculate the position and 
momentum of every gas molecule in a balloon, instead of making do with 
macroscopic quantities such as pressure. Are there hidden away, beyond the 
purview of the quantum formalism as we know it, markers or variables that 
determine exactly when a particular uranium nucleus will decay, or exactly where 
the electron is located within a hydrogen atom? Do the measured values exist 
before they are measured? 


Here is the choice. One possibility is that quantum mechanics is a complete 
theory, giving a complete description of the state of any system. Since the 
formalism of quantum mechanics provides only a probability distribution for the 
position of a particle, we must then say that the particle does not have a position 
until a position measurement is made. The alternative possibility is that quantum 
mechanics, although highly successful, is an incomplete theory, giving 
only a partial description of the state of a system. In this case, the particle 
could have a definite position at all times, but this position is not part of the 
quantum-mechanical description. Any hypothetical variables that are supposed to 
determine the results of measurements with certainty, but are absent from the 
quantum-mechanical description, are called hidden variables. Many physicists 
(and this course) plump for the first alternative, asserting that quantum mechanics 
is a complete theory and that hidden variables do not exist. One aim of this 
chapter is to explain why this is so, in spite of the strange implications this has for 
the nature of reality. 


Einstein did not accept the standard quantum-mechanical view, famously 
declaring that ‘God does not play dice’. Already in 1927, he argued that quantum 
mechanics must be incomplete. He based his argument on the phenomenon 
of electron diffraction. We said in Book 1 that, when an electron passes 
through a tiny slit and is subsequently detected at some point on the screen, it 
instantaneously seems to ‘know’ not to be detected at any other point. This is an 
example of a non-local effect, meaning that what happens at some point appears 
to be affected by what happens at other points that are too far away for any 
communication of information, even at the speed of light. Einstein could not 
accept such non-locality, preferring to believe that the electron does have a 
position before a spot forms on the screen, and that quantum mechanics is 
incomplete since it does not include a representation of that ‘hidden’ position 
prior to measurement. 


However, this argument is not at all conclusive. Supporters of quantum 
mechanics, such as Bohr and Heisenberg, could simply assert that non-local 
effects do occur in quantum mechanics, whether we like it or not. The question 
simply cannot be decided on the basis of the behaviour of a single particle. 

The paper by Einstein, Podolsky and Rosen (EPR) of 1935 provided a sharper 
argument for Einstein’s point of view, based on the behaviour of two particles 
rather than one. It was Schrödinger who, in the article quoted in the Introduction, 
referred to the state of these two particles as being entangled. In 1952, David 
Bohm recast the EPR argument in a form that is easier for us to explain, and we 
prepare the way for Bohm’s version of the EPR argument in Section 6.2. 


It is interesting to note that John von Neumann (in 1932) claimed to prove that 
any theory based on hidden variables would be unable to reproduce the results of 
quantum mechanics. Since quantum mechanics is undeniably successful, this 
led many physicists to suppose that theories based on hidden variables would 
never be viable. However, it turned out that von Neumann’s proof contained an 
oversight, and that it did not establish anything conclusive about hidden variables. 
In the 1950s, David Bohm demonstrated that Schrödinger’s equation for a single 
particle can be reinterpreted in terms of hidden variables. However, Bohm’s theory 
has features that lead most physicists to reject it; the price paid for particles 
having definite values of observables prior to measurement is such an extreme 
form of non-locality that the medicine seems worse than the cure. Bohm’s theory 
even failed to enthuse Einstein, who had been hoping that hidden variables would 
one day be accepted, but it did open up the whole issue of hidden-variable 
alternatives to quantum mechanics. 


Two more recent developments have transformed the subject. Firstly, John Bell 
showed that it is possible, in principle, to devise experiments that distinguish 
between quantum mechanics and the most plausible class of hidden-variable 
theory. Secondly, advances in experimental techniques, particularly with tunable 
lasers, have allowed some of these experiments to be carried out. But, before we 
can say how entanglement allows experimental tests of hidden variables, we must 
first explain entanglement itself. 


6.2 Entanglement and spooky action at a distance 


Consider the following statement: A measurement of the properties of a particle 
can have an instantaneous effect on a measurement of properties of a second 
particle located indefinitely far from the first. How could this possibly be? 
Einstein, in a letter to Max Born in 1947, referred to such ‘spooky action at a 
distance’ (‘spukhafte Fernwirkungen’ in German) as being contrary to our 
deep-rooted understanding of what it means for a body to have an individual 
existence. Surely what happens ‘here’ cannot be influenced by what happens 
‘there’, when ‘there’ is too remote for light to travel ‘here’ in the timescale of the 
experiments. But, remarkably, that is just what happens; such non-locality is a 
fact of Nature. 


The property of a pair (or more) of particles that makes this ‘spookiness’ possible 
is entanglement. Two or more entangled particles may show non-local effects. 


Most of the experiments that have actually demonstrated these non-local effects 
involve photons. However, much of the theoretical literature involves pairs of 
spin-½ particles; entanglement is most naturally introduced in terms of such pairs. 
Our first example of entanglement will therefore be in terms of entangled spin-½ 
particles, and we shall move on to more general cases later, especially entangled 
photons. 


6.2.1 Entangled states of two spin-½ particles 


Bohm’s version of the EPR argument is based on the behaviour of the singlet state 
of two identical spin-½ particles. From Chapter 4, we recall that the singlet state 
of two spin-½ particles may be written 

$$|S=0, M_S=0\rangle = \frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle_1|{\downarrow}\rangle_2 - |{\downarrow}\rangle_1|{\uparrow}\rangle_2\bigr) = \frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr). \tag{6.1}$$

The last line of Equation 6.1 uses the positional notation, in which the first 
arrow is understood to apply to the particle labelled ‘1’ in the first line. In this 
connection, ‘particle 1’ simply means ‘the particle detected in detector 1’. 

In this expression, $|S=0, M_S=0\rangle$ represents the spin state of a two-particle 
system with total spin $S = 0$ and $M_S = 0$. The fact that a state with $S = 0$ 
can only have $M_S = 0$ is, of course, the reason for the term ‘singlet state’. 
$|{\uparrow}\rangle_1$ represents particle 1 in a state with $m_s = +\tfrac{1}{2}$, and 
$|{\downarrow}\rangle_2$ represents particle 2 in a state with $m_s = -\tfrac{1}{2}$. 
The overall factor $1/\sqrt{2}$ normalizes the state to unity. 

Finally, note that $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, without any directional 
indication on the arrows, refer to the spin-up and spin-down states in the 
z-direction. 

Recall that the component of spin in the z-direction is $m_s\hbar$. 

Now let us imagine that particles 1 and 2 have been prepared in such a singlet 
state while they are close together (this is perfectly reasonable), and that they then 
become separated. It doesn’t really matter how far they separate, but let’s say they 
are very far apart when particle 1 interacts with a detector (which you can think 
of as a Stern–Gerlach analyzer) that can measure particle 1 to be spin-up or 
spin-down in the z-direction. One or other of those results must be obtained; let 
us say particle 1 is found to be spin-up, i.e. with $m_s = +\tfrac{1}{2}$. That means that 
the state vector collapses onto the first part of the singlet state in Equation 6.1 to 
become a new state 

$$|\text{collapsed}\rangle = |{\uparrow}\rangle_1|{\downarrow}\rangle_2 = |{\uparrow\downarrow}\rangle, \tag{6.2}$$

which no longer requires the normalizing factor $1/\sqrt{2}$. The two particles are no 
longer in a singlet state, and, according to quantum mechanics, we believe that a 
measurement on particle 2, whether in Milton Keynes, Timbuktu or somewhere in 
the Andromeda galaxy, would reveal it to be spin-down, with $m_s = -\tfrac{1}{2}$. 
On the other hand, if the measurement on particle 1 gave $m_s = -\tfrac{1}{2}$, then a 
measurement on particle 2 would reveal it to have $m_s = +\tfrac{1}{2}$, because the first 
measurement would have ‘collapsed the state vector’ onto the second term in 
Equation 6.1. 


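● Verify that the state represented by $|\text{collapsed}\rangle$ in Equation 6.2 is 
normalized to unity. 

○ We need to show that $\langle\text{collapsed}|\text{collapsed}\rangle = 1$. But this 
quantity is $\langle{\uparrow\downarrow}|{\uparrow\downarrow}\rangle$. In such expressions, the 
bra and ket for particle 1 go together, as do the bra and ket for particle 2, giving 
(departing briefly from positional notation) 
$\langle{\uparrow}|{\uparrow}\rangle_1 \times \langle{\downarrow}|{\downarrow}\rangle_2 = 1 \times 1 = 1$, 
since each state $|{\uparrow}\rangle_1$ and $|{\downarrow}\rangle_2$ is normalized. 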


In neither case can we predict what the results of a measurement on particle 1 
would be. It’s like tossing a perfectly balanced coin: over a sufficiently long 
run of tosses there would be 50% heads and 50% tails. But, once $m_s$ for 
particle 1 has been measured, we know that a measurement of the 
spin of particle 2 will yield the opposite. And, of course, it works both ways: the 
result ‘there’, whatever it was, would determine the result ‘here’ just as much as a 
measurement ‘here’ determines the result ‘there’. Schrödinger referred to the 
particles as being subject to ‘Verschränkung’ (literally ‘folding’ or ‘crossing 
over’), which is now translated into English as entanglement. A key point is that 
the results of the measurement ‘there’ are not directly communicated to the 
measurement ‘here’ by any possible signalling device; one measurement could 
be in the Andromeda galaxy and the other in Milton Keynes. Here indeed is 
Einstein’s ‘spooky action at a distance’ and with this, ‘spooky’ entered the 
vocabulary of physics. 

We stress the distance between the measurements in order to preclude the 
possibility of information from one measurement reaching the other measurement 
at the speed of light, the maximum speed at which information can travel. 


At first sight, there is a simple classical description which avoids this spookiness. 
If we assume that the particles have opposite spin components in the z-direction 
when they are released, and that these values are definite even though we do not 
know them, it is not surprising that a measurement of spin-up for one particle will 
be accompanied by a measurement of spin-down for the other particle (and vice 
versa). However, you will soon see that this common-sense classical view itself 
runs into difficulties. Moreover, it is not the quantum-mechanical view, which 
denies that particles have definite spin components prior to measurement. 


According to quantum mechanics, everything we know about the spins of the 
particles is encapsulated in the state vector given in Equation 6.1. Yet, as Einstein 
emphasized, the description seems incomplete: the particle ‘there’ may be too far 
away to know that a measurement ‘here’ has robbed it of the possibility enshrined 
in Equation 6.1 of being either spin-up or spin-down. This is similar to the 
diffraction of a single electron: the appearance of an electron at one point on the 
screen instantaneously makes its appearance at any other point impossible. In 
the spin-singlet case, if we measure one particle’s spin ‘here’, we know what 

the result will be for the other particle’s spin ‘there’; the measurement ‘here’ 
instantaneously determines the result of the measurement ‘there’. This is a 
manifestation of non-locality. 


The mention of instantaneous effects over great distances naturally arouses the 
suspicions of anyone who is familiar with special relativity. In fact, it can be 
shown that such collapses cannot be used for transmitting information (the kind 
that sells newspapers) faster than light, so the predictions of special relativity 
are obeyed. Nevertheless, the result is extraordinary and underlies all of the 
phenomena connected with entanglement. 
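
The claim that collapse cannot carry a signal can be illustrated numerically. The 
following is a minimal sketch, assuming both analyzers lie in the xz-plane 
(φ = 0) so that all amplitudes are real; the function names are hypothetical and 
chosen only for this illustration. It sums the joint probabilities over the 
outcomes for particle 1 and shows that the chance of finding particle 2 spin-up 
does not depend on how the distant analyzer for particle 1 is oriented. 

```python
import math

def spin_up(theta):
    """Spin-up state along a direction at angle theta to z in the xz-plane (phi = 0)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def spin_down(theta):
    """Spin-down state along the same direction (phi = 0)."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def joint_probability(bra1, bra2):
    """|<bra1 bra2|singlet>|^2 for real single-particle states (components: up, down)."""
    amplitude = (bra1[0] * bra2[1] - bra1[1] * bra2[0]) / math.sqrt(2)
    return amplitude ** 2

def marginal_up_at_2(theta1, theta2):
    """Probability that particle 2 is spin-up along theta2, whatever particle 1 gave along theta1."""
    up2 = spin_up(theta2)
    return (joint_probability(spin_up(theta1), up2)
            + joint_probability(spin_down(theta1), up2))

# Vary the setting of the distant analyzer; the local statistics stay the same.
for theta1 in (0.0, math.pi / 3, math.pi / 2, math.pi):
    print(round(marginal_up_at_2(theta1, 0.7), 12))
```

Every printed value is one half, whichever angle is chosen for the first analyzer. 
An observer at the second analyzer therefore sees a 50:50 sequence of results and 
cannot tell from local data alone what, if anything, was measured on particle 1. 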


6.2.2 When is a state entangled? 


Entangled states 


A wave function or state vector representing the state of two particles is said 
to represent an entangled state if it cannot be expressed as a product of 
terms each specifying the state of a single particle. Entanglement does not 
depend upon the basis used to describe the state. 


The singlet state given in Equation 6.1 represents an entangled state of two spin-½ 
particles, but entanglement is a general property that need not necessarily involve 
spin. 

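Product states exist for non-identical particles, so the spatial eigenfunction of two 
particles with coordinates $\mathbf{r}_1$ and $\mathbf{r}_2$, 

$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2), \tag{6.3}$$

is certainly not entangled. However, the spatially antisymmetric eigenfunction 

$$\psi_A(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{\sqrt{2}}\bigl[\psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2) - \psi_2(\mathbf{r}_1)\,\psi_1(\mathbf{r}_2)\bigr] \tag{6.4}$$

cannot be written as such a product, and this does represent an entangled state. 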


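Two non-entangled states involving spin are the $S = 1$, $M_S = \pm 1$ triplet states of 
two spin-½ particles that we write as 

$$|S=1, M_S=1\rangle = |{\uparrow}\rangle_1|{\uparrow}\rangle_2 = |{\uparrow\uparrow}\rangle$$

and 

$$|S=1, M_S=-1\rangle = |{\downarrow}\rangle_1|{\downarrow}\rangle_2 = |{\downarrow\downarrow}\rangle.$$

However, one of the triplet states, that with $M_S = 0$, is entangled. This entangled 
state is like Equation 6.1 but with a plus sign: 

$$|S=1, M_S=0\rangle = \frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr). \tag{6.5}$$

Equations 6.1 and 6.5 have a common feature: 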


A characteristic of entangled states is that the members of a system of 
entangled particles do not each have their own quantum states although the 
system as a whole does. 


For example, in the case of the two spin-½ particles in a singlet state 
(Equation 6.1) there is no way to predict the outcome of a spin measurement on a 
single particle although the pair can be assigned quantum numbers $S = 0$ and 
$M_S = 0$ that determine the magnitude and z-component of the total spin. 
We have said that entanglement does not depend on the choice of basis. As an 
example, consider the state: 

$$|A\rangle = \tfrac{1}{2}\bigl(|{\uparrow_x}\rangle_1|{\uparrow_x}\rangle_2 - |{\uparrow_x}\rangle_1|{\downarrow_x}\rangle_2 + |{\downarrow_x}\rangle_1|{\downarrow_x}\rangle_2 - |{\downarrow_x}\rangle_1|{\uparrow_x}\rangle_2\bigr), \tag{6.6}$$

where $|{\uparrow_x}\rangle$ and $|{\downarrow_x}\rangle$ are spin-up and spin-down states in the 
x-direction. Although it is not immediately obvious, this can be factored as follows: 

$$|A\rangle = \frac{1}{\sqrt{2}}\bigl(|{\uparrow_x}\rangle_1 - |{\downarrow_x}\rangle_1\bigr) \times \frac{1}{\sqrt{2}}\bigl(|{\uparrow_x}\rangle_2 - |{\downarrow_x}\rangle_2\bigr). \tag{6.7}$$

You can multiply out the terms in this equation to check that it is equivalent to 
Equation 6.6. 

This is a product of terms specifying the state of each particle, so $|A\rangle$ is not 
entangled. We have seen this in the basis of $|{\uparrow_x}\rangle$ and $|{\downarrow_x}\rangle$, 
but even if we change to a different basis, $|A\rangle$ will still be a product of terms 
specifying the states of particle 1 and particle 2, and so will not be entangled in 
the new basis. 

For example, we know from Chapter 3 that the spin-up and spin-down states in 
the x-direction can be written as 

$$|{\uparrow_x}\rangle = \frac{1}{\sqrt{2}}\bigl(|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\bigr) \quad\text{and}\quad |{\downarrow_x}\rangle = \frac{1}{\sqrt{2}}\bigl(-|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\bigr).$$

Combining these results, it is easy to see that Equation 6.7 can also be written as 

$$|A\rangle = |{\uparrow_z}\rangle_1\,|{\uparrow_z}\rangle_2.$$

Non-entanglement is immediately obvious in the basis of $|{\uparrow_z}\rangle = |{\uparrow}\rangle$ 
and $|{\downarrow_z}\rangle = |{\downarrow}\rangle$, but the property of being entangled, or not 
entangled, has nothing to do with the choice of basis. 
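
For two spin-½ particles there is a simple numerical test for entanglement, which 
the sketch below applies. It relies on a standard result that is not derived in 
this chapter: if the state is written with amplitudes c00, c01, c10, c11 (first index 
for particle 1, second for particle 2, 0 for spin-up and 1 for spin-down), the state 
is a product state exactly when c00·c11 − c01·c10 = 0. The function names and the 
matrix `TO_X` are hypothetical, introduced only for this illustration. 

```python
import math

R = 1 / math.sqrt(2)

def product_test(c):
    """Return |c00*c11 - c01*c10|; zero means the state is a product state."""
    c00, c01, c10, c11 = c
    return abs(c00 * c11 - c01 * c10)

def change_basis(c, u1, u2):
    """Amplitudes in a new basis; rows of u1, u2 are the new-basis bras for particles 1, 2."""
    m = ((c[0], c[1]), (c[2], c[3]))
    new = [[sum(u1[i][k] * m[k][l] * u2[j][l] for k in range(2) for l in range(2))
            for j in range(2)] for i in range(2)]
    return (new[0][0], new[0][1], new[1][0], new[1][1])

# Bras for spin-up and spin-down along x, expressed in the z-basis.
TO_X = ((R, R), (-R, R))

states = {
    "singlet (Eqn 6.1)": (0.0, R, -R, 0.0),
    "triplet M_S = 0 (Eqn 6.5)": (0.0, R, R, 0.0),
    "triplet M_S = +1": (1.0, 0.0, 0.0, 0.0),
}

for name, c in states.items():
    in_x = change_basis(c, TO_X, TO_X)
    print(name, round(product_test(c), 12), round(product_test(in_x), 12))
```

The test value is the same in the z-basis and the x-basis for each state: nonzero 
for the two entangled states and zero for the product state, even though the 
product state has four nonzero amplitudes once it is written in the x-basis. 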


6.2.3 The singlet state from another angle 


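One property of the singlet state makes it particularly interesting from the point of 
view of entanglement. The singlet state $|S=0, M_S=0\rangle$ looks the same from all 
directions. To explain what this means, we recall that a single-particle state 
corresponding to spin-up in the z-direction is denoted by $|{\uparrow}\rangle$, which is 
written in spinor notation as $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. We also know from 
Chapter 3 that a general spin state which is spin-up in the direction of unit vector 
$\mathbf{n}$, defined by the spherical coordinate angles $\theta$ and $\phi$, can be 
written in spinor and ket form as 

$$|{\uparrow_n}\rangle = \begin{bmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{bmatrix} = \cos(\theta/2)\,|{\uparrow}\rangle + e^{i\phi}\sin(\theta/2)\,|{\downarrow}\rangle, \tag{6.8}$$

with a similar equation for spin-down in the n-direction: 

$$|{\downarrow_n}\rangle = \begin{bmatrix} -e^{-i\phi}\sin(\theta/2) \\ \cos(\theta/2) \end{bmatrix} = -e^{-i\phi}\sin(\theta/2)\,|{\uparrow}\rangle + \cos(\theta/2)\,|{\downarrow}\rangle. \tag{6.9}$$

As always, kets with arrows without subscripts, as in $|{\uparrow}\rangle$ and 
$|{\downarrow}\rangle$, signify states that are spin-up or spin-down with respect to the 
z-axis. Using Equations 6.8 and 6.9, it is not hard to show (see Worked Example 6.1 
below) that, for any direction $\mathbf{n}$, 

$$\frac{1}{\sqrt{2}}\bigl(|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2 - |{\downarrow_n}\rangle_1|{\uparrow_n}\rangle_2\bigr) = \frac{1}{\sqrt{2}}\bigl(|{\uparrow_n\downarrow_n}\rangle - |{\downarrow_n\uparrow_n}\rangle\bigr) = \frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr). \tag{6.10}$$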
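Worked Example 6.1 

Essential skill: Manipulating entangled spin states 

Use Equations 6.8 and 6.9 to verify Equation 6.10. 

Solution 

Substituting the expressions for $|{\uparrow_n}\rangle$ and $|{\downarrow_n}\rangle$ given in 
Equations 6.8 and 6.9 into the left-hand side of Equation 6.10, we obtain 

$$\begin{aligned}
\frac{1}{\sqrt{2}}\bigl(|{\uparrow_n\downarrow_n}\rangle - |{\downarrow_n\uparrow_n}\rangle\bigr)
&= \frac{1}{\sqrt{2}}\bigl(|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2 - |{\downarrow_n}\rangle_1|{\uparrow_n}\rangle_2\bigr) \\
&= \frac{1}{\sqrt{2}}\Bigl[\bigl(\cos(\theta/2)\,|{\uparrow}\rangle_1 + e^{i\phi}\sin(\theta/2)\,|{\downarrow}\rangle_1\bigr) \\
&\qquad\quad \times \bigl(-e^{-i\phi}\sin(\theta/2)\,|{\uparrow}\rangle_2 + \cos(\theta/2)\,|{\downarrow}\rangle_2\bigr) \\
&\qquad - \bigl(-e^{-i\phi}\sin(\theta/2)\,|{\uparrow}\rangle_1 + \cos(\theta/2)\,|{\downarrow}\rangle_1\bigr) \\
&\qquad\quad \times \bigl(\cos(\theta/2)\,|{\uparrow}\rangle_2 + e^{i\phi}\sin(\theta/2)\,|{\downarrow}\rangle_2\bigr)\Bigr].
\end{aligned}$$

Collecting terms and multiplying out, noting that the terms involving 
$|{\uparrow}\rangle_1|{\uparrow}\rangle_2$ and $|{\downarrow}\rangle_1|{\downarrow}\rangle_2$ have 
cancelling coefficients, this gives 

$$\begin{aligned}
\frac{1}{\sqrt{2}}\bigl(|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2 - |{\downarrow_n}\rangle_1|{\uparrow_n}\rangle_2\bigr)
&= \frac{1}{\sqrt{2}}\Bigl[\bigl(\cos^2(\theta/2) + \sin^2(\theta/2)\bigr)\,|{\uparrow}\rangle_1|{\downarrow}\rangle_2 \\
&\qquad - \bigl(\cos^2(\theta/2) + \sin^2(\theta/2)\bigr)\,|{\downarrow}\rangle_1|{\uparrow}\rangle_2\Bigr] \\
&= \frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle_1|{\downarrow}\rangle_2 - |{\downarrow}\rangle_1|{\uparrow}\rangle_2\bigr) \\
&= \frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr),
\end{aligned}$$

which is the right-hand side of Equation 6.10, as required. 

Remember $\cos^2(\theta/2) + \sin^2(\theta/2) = 1$. 

Reminder: with the positional notation, the first symbol refers to particle 1 and 
the second refers to particle 2. 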
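
The algebra of Worked Example 6.1 can also be checked numerically. The sketch 
below builds the spin-up and spin-down states along a direction n from Equations 
6.8 and 6.9, forms the antisymmetric combination on the left-hand side of 
Equation 6.10, and prints its four amplitudes in the z-basis. The function names 
are hypothetical, and the amplitudes are ordered as ↑↑, ↑↓, ↓↑, ↓↓. 

```python
import cmath
import math

def up_n(theta, phi):
    """Equation 6.8: spin-up along n, as (up, down) components in the z-basis."""
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def down_n(theta, phi):
    """Equation 6.9: spin-down along n, as (up, down) components in the z-basis."""
    return (-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2))

def two_particle(a, b):
    """Product state |a>_1 |b>_2 as amplitudes ordered (uu, ud, du, dd)."""
    return tuple(x * y for x in a for y in b)

def singlet_along(theta, phi):
    """Left-hand side of Equation 6.10 for the direction (theta, phi)."""
    ud = two_particle(up_n(theta, phi), down_n(theta, phi))
    du = two_particle(down_n(theta, phi), up_n(theta, phi))
    return tuple((p - q) / math.sqrt(2) for p, q in zip(ud, du))

for theta, phi in ((0.0, 0.0), (1.0, 0.5), (2.2, 4.0)):
    amplitudes = singlet_along(theta, phi)
    print([complex(round(z.real, 12), round(z.imag, 12)) for z in amplitudes])
```

For every choice of angles the ↑↑ and ↓↓ amplitudes vanish and the ↑↓ and ↓↑ 
amplitudes are +1/√2 and −1/√2 (up to rounding), which is the right-hand side of 
Equation 6.10. 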
What this worked example tells us is that the singlet state $|S=0, M_S=0\rangle$ takes 
a similar form in any basis. Let us consider what this implies. Suppose that a 
singlet state of a pair of spin-½ particles, $|S=0, M_S=0\rangle$, is produced, and then 
a measurement of the spin of one of them is made with a Stern–Gerlach apparatus 
with its magnet oriented in an arbitrary direction $\mathbf{n}$. We have now seen that 
the singlet state can be represented by 

$$|\text{singlet}\rangle = \frac{1}{\sqrt{2}}\bigl(|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2 - |{\downarrow_n}\rangle_1|{\uparrow_n}\rangle_2\bigr).$$

The result of such a measurement at this new angle must be spin-up or spin-down 
in the direction of $\mathbf{n}$. Then we know with certainty what the result will be of a 
spin measurement of the other particle, also along the direction of $\mathbf{n}$, whatever 
the direction of $\mathbf{n}$. If the first measurement gave spin-up for particle 1, then the 
singlet state must have collapsed onto the first term, 
$|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2$. It then follows that particle 2 would certainly 
be measured to be spin-down, i.e. having spin component $-\tfrac{1}{2}\hbar$ in the 
n-direction. Even if the second measurement were light-years distant, the measured 
spin would be the opposite of what was found for the first particle. 


This is an extraordinary result. As Niels Bohr said, ‘If you are not shocked by 
quantum mechanics, you don’t understand it’, and here is the touchstone of 
shocking! Consider: it is natural to think, as Einstein did to the end of his 
life, that a measurement of the spin of a nearby particle with a Stern–Gerlach 
analyzer, oriented as you wish, could in no way influence the outcome of a spin 
measurement on a distant particle. Moreover, just as one measurement in the 
x-direction tells you what another, distant, measurement in the x-direction must 
yield, the same is true of z-direction measurements. So are both 
x- and z-components of spin simultaneously determined even before they are 
measured? Not according to quantum mechanics! We know that $S_x$ and $S_z$ do not 
commute with one another, so the observables $S_x$ and $S_z$ are incompatible. They 
do not both have definite values in the same state. If we measure $S_x$ for the first 
particle, the second particle collapses into a state with a definite value of $S_x$; if 
we measure $S_z$ for the first particle, the second particle collapses into a state with 
a definite value of $S_z$. The spooky influence of the first measurement on the state 
of the second particle is quite unnerving. We shall return to this discussion in 
Section 6.3.2. 


Exercise 6.1 A pair of spin-½ particles is created in the singlet state 
$|S=0, M_S=0\rangle$. The first particle is measured to have spin $+\tfrac{1}{2}\hbar$ in the 
direction $\mathbf{n}$ defined by $\theta = 60°$ and $\phi = 0°$. What is the probability that 
a subsequent spin measurement on the second particle, with the detector oriented 
along the z-axis, will yield $+\tfrac{1}{2}\hbar$? 


6.2.4 Many phenomena involve entanglement 


Entanglement is not just a property of singlet states of two electrons. It is involved 
in a huge range of phenomena; we describe two below. 


The optical beam splitter 

An example of entanglement is provided by the optical beam splitter (implemented 
as a half-silvered mirror) of Book 1, Chapter 1. Such a beam splitter was at the 
heart of the quantum random number generator and also the Mach–Zehnder 
interferometer, in which there were two beam splitters. In that chapter we made the 
rather cryptic statement: ‘... each photon in some sense actually goes both straight 
through and is reflected at 90° ...’. The language of entanglement allows us to be a 
little clearer on this matter. In Figure 6.3 we show a beam splitter with detectors 
that can register the arrival of photons that are transmitted or reflected. We assume 
that the beam splitter is ideal in the sense that equal numbers of photons are 
detected in each detector. Notice that we do not say ‘equal numbers of photons 
travel each route through the beam splitter’. In classical physics, we would know 
what such a statement meant, but this is not so in quantum physics. However, we do 
have a clear idea of what it means to say that ‘equal numbers of photons are 
detected in each detector.’ We simply ask how many counts have been registered 
electronically. 

Figure 6.3 Half of the photons arriving at the beam splitter from the left are 
detected at the ‘transmitted straight through’ detector, and half at the ‘reflected 
downwards’ detector. 

In order to clarify what is going on in the beam splitter, we must introduce some 
notation. We denote a state with a single photon in it as $|1\rangle$, and a state with zero 
photons as $|0\rangle$. We shall also use subscripts T and R, respectively, to indicate 
transmitted and reflected photon states. We assert, without proof, that after a 
photon falls on the beam splitter, the state of the system can be written as 

$$|\mathrm{BS}\rangle = a\,|0\rangle_R|1\rangle_T + b\,|1\rangle_R|0\rangle_T, \tag{6.11}$$

where the complex coefficients $a$ and $b$ satisfy $|a|^2 + |b|^2 = 1$ (to ensure 
normalization) and $|a|^2 = |b|^2$ (since equal numbers of photons are detected 
in each detector). We cannot say that $a = b = 1/\sqrt{2}$ because reflected and 
transmitted photons get different phases $\phi_R$ and $\phi_T$, giving 
$a = e^{i\phi_R}/\sqrt{2}$ and $b = e^{i\phi_T}/\sqrt{2}$; these phases have no effect on our 
argument here. 

In this context, $|0\rangle$ refers to a state with no photons. It might seem surprising 
to discuss states of no particles, but this is done in the quantum theory of fields. 
The $|0\rangle$ state for photons can be thought of as the state of zero-point motion of 
the electromagnetic field, analogous to the ground state $|n = 0\rangle$ of a harmonic 
oscillator. 


Equation 6.11 represents an entangled state. Neither the transmitted nor reflected 
channels of the beam splitter can be said to have one photon, neither can be said 
to have no photons, and there is definitely no such thing as half a photon. If the 
detector for transmitted photons fires, then the state vector $|\mathrm{BS}\rangle$ instantly 
loses the term $|1\rangle_R|0\rangle_T$ that represents a photon in the ‘reflected’ arm. 
Likewise, if the detector for reflected photons fires, the state vector 
$|\mathrm{BS}\rangle$ instantly loses its $|0\rangle_R|1\rangle_T$ term that represents a photon in 
the ‘transmitted’ arm. The behaviour of Mach–Zehnder interferometers discussed in 
Chapter 1 of Book 1 can be explained on the basis of these ideas. 


Incidentally, Equation 6.11 allows us to illustrate what is meant by partial 
entanglement. If $|a|^2 = |b|^2 = \tfrac{1}{2}$, then the state is said to be fully 
entangled. If either $|a|^2 = 0$ or $|b|^2 = 0$, then the state is clearly not entangled. 
There is nothing to prevent intermediate cases, which are described as being 
partially entangled. In fact one has to have a very good beam splitter to make a fully 
entangled state. 
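
A short argument shows why the state of Equation 6.11 fails to be a product state 
whenever both coefficients are nonzero. This is a sketch rather than part of the 
original discussion; the single-arm coefficients α, β, γ and δ are symbols 
introduced here only for illustration. 

```latex
% Suppose the beam-splitter state could be written as one state per arm:
\begin{align*}
|\mathrm{BS}\rangle
  &= \bigl(\alpha\,|0\rangle_R + \beta\,|1\rangle_R\bigr)
     \bigl(\gamma\,|0\rangle_T + \delta\,|1\rangle_T\bigr) \\
  &= \alpha\gamma\,|0\rangle_R|0\rangle_T + \alpha\delta\,|0\rangle_R|1\rangle_T
   + \beta\gamma\,|1\rangle_R|0\rangle_T + \beta\delta\,|1\rangle_R|1\rangle_T .
\end{align*}
% Matching coefficients with a|0>_R|1>_T + b|1>_R|0>_T requires
\begin{align*}
\alpha\gamma = 0, \qquad \alpha\delta = a, \qquad
\beta\gamma = b, \qquad \beta\delta = 0 ,
\end{align*}
% and therefore
\begin{align*}
ab = (\alpha\delta)(\beta\gamma) = (\alpha\gamma)(\beta\delta) = 0 .
\end{align*}
```

A factorization therefore exists only if a = 0 or b = 0, in which case the state 
reduces to a single product term. Any state with both a and b nonzero is 
entangled to some degree. 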


All the extravagant examples presented in an earlier subsection for singlet states 
of spin-½ particles (one detector in Milton Keynes, one in Timbuktu, etc.) apply to 
the beam splitter. If the paths from the beam splitter could somehow be extended 
to these cities, then we could be sure that the detection of state |1) in Milton 
Keynes would instantaneously mean that the state in Timbuktu was |0), so no 
photon would be detected there. 


Our picture of the output from a beam splitter as an entangled state is the 
reason for the vagueness in our account of the passage of photons through a 
Mach–Zehnder interferometer. None of the descriptions that would be appropriate 
for a macroscopic object apply: we cannot say a photon goes via one path, nor can 
we say that it goes the other way. Instead, the photon has complex probability 
amplitudes $a$ and $b$ for being detected on either path, with corresponding 
probabilities $|a|^2$ and $|b|^2$. 


We have now given a mathematical description of a situation that might, with 

due circumspection, be described as the photon going both ways through a 
Mach-Zehnder interferometer. Finally, we remark that what we have said here for 
photons has also been verified experimentally for material particles such as 
neutrons, for which there are analogues of Mach—Zehnder interferometers. 




Entanglement in α-decay 

Let us suppose that we have an isolated nucleus of ²³⁸U in the far reaches of 
empty space. Let us also say that it emits an α-particle by means of the tunnelling 
process described in Book 1. It turns out that α-decay generally takes place with 
the emission of an α-particle having an equal probability of being emitted in all 
directions. That is a deceptive statement; what it really means is that the α-particle 
will have an equal probability of being detected coming out in all directions. The 
α-particle does not have a direction until it has been detected coming out in some 
direction, any more than a particle described by $\Psi(x,t)$ is ‘at’ $x$ until it has been 
measured to be at $x$. But now for a complicating factor: momentum is conserved. 
In its rest frame, the original ²³⁸U nucleus has zero momentum, so the total 
momentum of the products, an α-particle and the ²³⁴Th daughter nucleus, must 
also be zero, and this is achieved by an appropriate recoil of the daughter nucleus. 
But the α-particle does not have a direction until it is detected, and so it does not 
have a momentum (a vector quantity) until it has been detected. So how does the 
daughter nucleus ‘know’ along which direction to recoil? 

The resolution of this question is as follows: the decay process leaves the 
α-particle and the ²³⁴Th daughter nucleus in an entangled state. The wave 
function contains all directions for the α-particle and all directions for the 
recoiling daughter nucleus. If the α-particle were to encounter a tiny piece of 
space dust, or even a planet, this would effectively measure its direction, i.e. its 
momentum vector. The daughter nucleus would at that instant, no matter how far 
away, assume a direction of motion such that its momentum vector would add to 
that of the α-particle to give zero. Of course, if it were the nucleus that was first to 
encounter something like a spaceship or a star, then it would be the α-particle that 
would, as a consequence, instantly acquire a corresponding well-determined 
momentum. The only limit to how far apart the two particles could be before the 
detection of one determined the momentum of the other would be the simple 
matter of how free of other particles the space surrounding the decaying nucleus 
is. For a uranium nucleus decaying in a lump of iron ore, the ore itself would 
probably very quickly register the recoiling nucleus, so that the α-particle would 
very quickly attain a well-defined direction. 


6.3 Quantifying the weirdness: testing for hidden variables 


The entangled states of the last section have very surprising properties, but we do 
not yet have cast-iron evidence for quantum non-locality and against hidden 
variables. We shall approach this crucial issue in two steps. 


Step 1 In this section, we describe a hypothetical experiment for a pair of spin-½ 
particles and derive the predictions of quantum mechanics for measurements 
of the spins of the two particles, exploiting what we know for singlet states. 
You will see that any hidden-variable theory (in a wide class known as local 
hidden-variable theories) leads to results that differ from those of quantum 
mechanics. 

Step 2 In Section 6.4 we describe real experiments that decide between the 
predictions of local hidden-variable theories and the predictions of quantum 
mechanics. Most of the experiments have been done using photons rather than 
spin-½ particles, so Section 6.4 begins with an account of relevant aspects of the 
quantum theory of polarized photons and then describes the experiments. 

Figure 6.4 (a) A schematic representation of an arrangement for spin 
measurements on a singlet state by Stern–Gerlach analyzers SG1 and SG2 
(SG stands for Stern–Gerlach). The y-axis runs in the direction from the source 
to SG2. (b) This diagram shows the angle θ from a perspective of looking towards 
the central source of particles from SG2 (with the orientation of the pole pieces 
indicated). 

θ for both detectors is always measured down from the positive z-axis towards the 
positive x-axis. It is a special case of the spherical polar coordinate θ, with φ = 0. 


6.3.1 Bohm’s hypothetical experiment 


It is helpful to have a definite experimental arrangement in mind when we 
consider the results of measurements on an entangled spin state. The particular 
arrangement described here was put forward by David Bohm in the early 1950s; 
the underlying physics is that of the experiment proposed by EPR in 1935. 
Figure 6.4 shows the arrangement. 




Pairs of particles are created in a singlet spin state at the source, and are directed 
to the left and to the right where their spins are measured by Stern–Gerlach 
analyzers. The analyzer on the left is SG1, and that on the right is SG2. We 
shall refer to the particle detected in SG1 as particle 1, and the state describing 
particle 1 in a spin-up state as $|{\uparrow}\rangle_1$; and similarly for particle 2. However, it 
is often awkward to include the particle subscript, particularly with bras, so we 
generally use the positional notation. Instead of 
$|{\uparrow_n}\rangle_1|{\downarrow_n}\rangle_2$ we write simply $|{\uparrow_n\downarrow_n}\rangle$. The 
corresponding bra would be $\langle{\uparrow_n\downarrow_n}|$. The first symbol always 
refers to the first particle (here, that detected on the left) and the second symbol 
refers to the second particle. Thus, $\langle ab| = \langle a|_1\langle b|_2$ is the bra 
corresponding to $|ab\rangle = |a\rangle_1|b\rangle_2$. 

Concerning the detectors in Figure 6.4, they always measure the spin component 
in the xz-plane, i.e. normal to the direction of the particle motion, which is along 
the y-axis. Vector $\mathbf{n}$, defining the direction in which the spin component is 
measured up or down, is at angle $\theta$ to the z-axis and $(90° - \theta)$ to the x-axis. 
Because both measurements are in the xz-plane, both $\phi$ angles are zero, though 
we shall leave them in the equations in the first steps before setting them to zero. 


6.3.2 Bohm’s experiment with parallel analyzers 


Section 6.2.1 already tells us what to expect if detectors SG1 and SG2 are aligned 
in the z-direction. If particle 1 is found to be spin-up, then particle 2 will be found 
to be spin-down, and vice versa. The same would be true if SG1 and SG2 are both 
aligned in the direction of the unit vector $\mathbf{n}$: whatever $\mathbf{n}$ is chosen, if 
particle 1 is found to be spin-up, then particle 2 will be found to be spin-down, and 
vice versa. 

Even these results, with analyzers SG1 and SG2 oriented at the same angle, are 
deeply perplexing. Let us say that the orientation of SG1 is chosen randomly to be 
in either the x-direction or the z-direction when a pair of spin-½ particles in a 
singlet state leave the source to be detected at SG1 and SG2. Suppose the spin 
measurement at SG2 is made before there is time for any signal at the speed of 
light to reach SG2 from the measurement at SG1. The orientation of SG2 is also 
independently chosen from the x-direction or z-direction. We now consider only 
the cases where SG2 happens to be oriented along the same direction as SG1. 
Once the measurement at SG1 has been made, the results are entirely predictable: 
spin-up in the x-direction at SG1 implies spin-down in the x-direction at SG2, 
and vice versa; it is the same if both are oriented along the z-direction. 


In quantum mechanics, the explanation is straightforward. The two particles are 
initially in an entangled singlet state, which can be represented in any orthonormal 
basis. If particle 1 is found to be spin-up in the x-direction, we can represent the 
singlet state as $(|{\uparrow_x\downarrow_x}\rangle - |{\downarrow_x\uparrow_x}\rangle)/\sqrt{2}$, and say that 
it collapses to $|{\uparrow_x\downarrow_x}\rangle$, a state in which particle 2 is spin-down in the 
x-direction. If particle 1 is found to be spin-up in the z-direction, we can represent 
the same singlet state as 
$(|{\uparrow_z\downarrow_z}\rangle - |{\downarrow_z\uparrow_z}\rangle)/\sqrt{2}$, and say that it collapses 
to $|{\uparrow_z\downarrow_z}\rangle$, a state in which a measurement of $S_z$ for particle 2 is 
certain to give spin-down. 


Einstein, Podolsky and Rosen, in their consideration of an analogous experiment, 
took a very different view. They noted that it is possible to predict either the value 
of $S_x$ or the value of $S_z$ for particle 2 (depending on the measurements made, and 
the results obtained, for particle 1). Bearing in mind that the large separation 
between the particles should preclude any influence on the measurement at SG2 
by the measurement at SG1, they claimed that the most reasonable interpretation 
is to say that particle 2 actually has definite values of both $S_x$ and $S_z$, presumably 
determined ever since it split up from particle 1. This is the EPR argument. Of 
course, the conclusion is inconsistent with quantum mechanics, but to Einstein 
this simply meant that quantum mechanics was incomplete; nowadays, we would 
interpret ‘incomplete’ as ‘being in need of hidden variables’. 


Again, we are at an impasse. Standard quantum mechanics says one thing, and 
hidden-variable theories say something else, but there is nothing other than 
prejudice or taste to decide between the two descriptions. To distinguish between 
hidden-variable theories and quantum mechanics, we must consider experiments 
with non-parallel detectors. That is what we shall do next. 


6.3.3 Quantum correlations for the singlet state 


Before analyzing Bohm’s hypothetical experiment for non-parallel detectors, let 
us review some basic results obtained in Chapter 3. Let us suppose that a spin-½ 
particle is in the spin state $|{\uparrow_z}\rangle = |{\uparrow}\rangle$, and that its spin component is 
measured in the n-direction, defined by the angles $\theta$ and $\phi$ of spherical 
coordinates. For simplicity, we take $\phi = 0$ so, if the particle is travelling along the 
y-axis, the spin measurement can be made with a Stern–Gerlach analyzer with its 
orientation vector in the xz-plane, at an angle $\theta$ to the z-axis. 




Here ‘preclude any influence’ 
refers to the fact that the spatial 
separation of SG1 and SG2 is 
greater than the speed of light 
times the time interval between 
the measurements at SG1 and 
SG2 in the rest frame of the 
detectors. In the background is 
the firm prediction of special 
relativity that information 
cannot travel faster than light. 
Indeed, it has been shown that 
the non-local correlations 
described here do not break this 
fundamental rule. 




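What is the probability of measuring spin-up in the n-direction in this state? We 
remind you of the result. The probability amplitude for this outcome is 
$\langle{\uparrow_n}|{\uparrow}\rangle$ where, from Equation 6.8, 

$$\langle{\uparrow_n}| = \cos(\theta/2)\,\langle{\uparrow}| + \sin(\theta/2)\,\langle{\downarrow}|. \tag{6.12}$$

(Going from ket to bra requires taking the complex conjugates of all coefficients, 
but they are real here since $\phi = 0$.) It follows that the probability for the particle 
to be measured as being spin-up in the direction $\mathbf{n}$ defined by angle $\theta$ is 

$$P({\uparrow_n}) = \bigl|\langle{\uparrow_n}|{\uparrow}\rangle\bigr|^2,$$

where 

$$\langle{\uparrow_n}|{\uparrow}\rangle = \cos(\theta/2)\,\langle{\uparrow}|{\uparrow}\rangle + \sin(\theta/2)\,\langle{\downarrow}|{\uparrow}\rangle = \cos(\theta/2),$$

recalling from Chapter 3 that $\langle{\uparrow}|{\uparrow}\rangle = 1$ and 
$\langle{\downarrow}|{\uparrow}\rangle = 0$, from the orthonormality of the spinors. Hence 

$$P({\uparrow_n}) = \cos^2(\theta/2). \tag{6.13}$$

This is the $\cos^2(\theta/2)$ rule. 

Similarly, the probability of a spin-½ particle in the spin state 
$|{\uparrow_z}\rangle = |{\uparrow}\rangle$ being measured to have spin component 
$-\tfrac{1}{2}\hbar$ in the n-direction is 

$$P({\downarrow_n}) = \bigl|\langle{\downarrow_n}|{\uparrow}\rangle\bigr|^2 = \sin^2(\theta/2).$$

Exercise 6.2 Show that $P({\downarrow_n}) = \sin^2(\theta/2)$. 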
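
As a quick numerical check of the two single-particle probabilities quoted above, 
the following sketch evaluates them for a particle prepared spin-up along z. It 
assumes φ = 0, so the bra components are real, and takes the spin-down bra from 
Equation 6.9; the names `UP_Z`, `bra_up_n`, `bra_down_n` and `probability` are 
hypothetical. 

```python
import math

UP_Z = (1.0, 0.0)  # the state |up>, as (up, down) components in the z-basis

def bra_up_n(theta):
    """Bra for spin-up along n (Equation 6.12, phi = 0)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def bra_down_n(theta):
    """Bra for spin-down along n (from Equation 6.9 with phi = 0)."""
    return (-math.sin(theta / 2), math.cos(theta / 2))

def probability(bra, ket):
    """Squared modulus of the (real) amplitude <bra|ket>."""
    amplitude = bra[0] * ket[0] + bra[1] * ket[1]
    return amplitude ** 2

for degrees in (0, 60, 90, 180):
    theta = math.radians(degrees)
    p_up = probability(bra_up_n(theta), UP_Z)
    p_down = probability(bra_down_n(theta), UP_Z)
    print(degrees, round(p_up, 6), round(p_down, 6))
```

At each angle the two probabilities add up to 1, as they must, because spin-up and 
spin-down along n are the only possible outcomes of the measurement. 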


With that brief review, we are ready to consider spin measurements on both 
particles of an entangled singlet state. We have in mind that the entangled state is 
created at some central location and the two particles fly off in opposite directions 
as shown in Figure 6.4. Some distance in each direction are Stern–Gerlach 
analyzers SG1 and SG2 oriented at different angles. Let us say that SG1 on the 
left is oriented so that spin-up is along the z-axis, i.e. at $\theta_1 = \phi_1 = 0$. SG2 on 
the right is rotated in the xz-plane through some angle $\theta$, with its orientation 
defined by a unit vector $\mathbf{n}$ with spherical coordinate angles $\theta_2 = \theta$ and 
$\phi_2 = 0$. 

We refer to the angle of SG2 as $\theta$ rather than $\theta_2$ to unclutter many equations. 

We ask: what is the probability that both particles of an entangled pair will be 
measured to be spin-up in their respective detectors? 


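To answer this question, we must calculate the probability of detecting the
particles in the state represented by $|\uparrow\uparrow_n\rangle$, for which the bra is $\langle\uparrow\uparrow_n|$. The
probability we seek is $|\langle\uparrow\uparrow_n|S = 0, M_S = 0\rangle|^2$, where

$$|S = 0, M_S = 0\rangle = \frac{1}{\sqrt2}\big(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\big) \tag{Eqn 6.1}$$

represents the singlet state. The corresponding probability amplitude is

$$\begin{aligned}
\langle\uparrow\uparrow_n|S = 0, M_S = 0\rangle &= \frac{1}{\sqrt2}\big(\langle\uparrow\uparrow_n|\uparrow\downarrow\rangle - \langle\uparrow\uparrow_n|\downarrow\uparrow\rangle\big) \\
&= \frac{1}{\sqrt2}\big(\langle\uparrow|\uparrow\rangle\langle\uparrow_n|\downarrow\rangle - \langle\uparrow|\downarrow\rangle\langle\uparrow_n|\uparrow\rangle\big).
\end{aligned} \tag{6.14}$$

Notice that the first (left-hand) entry in the bra, $\langle\uparrow\uparrow_n|$, goes with the first
(left-hand) entry in each of the kets in Equation 6.1, and the second entry in the
bra goes with the second entries in each of the kets. The first and third bra-kets of
Equation 6.14 are easy: $\langle\uparrow|\uparrow\rangle = 1$ and $\langle\uparrow|\downarrow\rangle = 0$. We therefore have

$$\begin{aligned}
\langle\uparrow\uparrow_n|S = 0, M_S = 0\rangle &= \frac{1}{\sqrt2}\,\langle\uparrow_n|\downarrow\rangle \\
&= \frac{1}{\sqrt2}\big(\cos(\theta/2)\,\langle\uparrow|\downarrow\rangle + \sin(\theta/2)\,\langle\downarrow|\downarrow\rangle\big) \\
&= \frac{1}{\sqrt2}\big(\cos(\theta/2)\times 0 + \sin(\theta/2)\times 1\big).
\end{aligned}$$

Hence the probability amplitude that we seek is

$$\langle\uparrow\uparrow_n|S = 0, M_S = 0\rangle = \frac{1}{\sqrt2}\sin(\theta/2). \tag{6.15}$$

It follows that the probability that both spin measurements, in SG1 and SG2, will
result in a spin-up measurement is

$$\text{probability (up, up)} = \tfrac12\sin^2(\theta/2). \tag{6.16}$$

The probability of finding both particles to be spin-down is also $\tfrac12\sin^2(\theta/2)$, as is
most easily seen by remembering that the singlet state is the same from all angles
(Section 6.2.3), so ‘both up’ and ‘both down’ are essentially the same to a singlet
state.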


What now is the probability of finding one of an entangled pair to be spin-up in 
SG1 and the other spin-down in SG2? 


Exercise 6.3 For this situation, fill in the spaces containing asterisks in the
expression for the required probability amplitude: $\langle *\,* \,|\, *\,* \rangle$. ■


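Evaluating $\langle\uparrow\downarrow_n|S = 0, M_S = 0\rangle$, using Equation 6.1, we obtain

$$\begin{aligned}
\langle\uparrow\downarrow_n|S = 0, M_S = 0\rangle &= \langle\uparrow\downarrow_n|\,\frac{1}{\sqrt2}\big(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\big) \\
&= \frac{1}{\sqrt2}\big(\langle\uparrow|\uparrow\rangle\langle\downarrow_n|\downarrow\rangle - \langle\uparrow|\downarrow\rangle\langle\downarrow_n|\uparrow\rangle\big) \\
&= \frac{1}{\sqrt2}\,\langle\downarrow_n|\downarrow\rangle.
\end{aligned}$$

From Equation 6.9, $\langle\downarrow_n| = -\sin(\theta/2)\,\langle\uparrow| + \cos(\theta/2)\,\langle\downarrow|$ and so

$$\langle\uparrow\downarrow_n|S = 0, M_S = 0\rangle = \frac{1}{\sqrt2}\cos(\theta/2).$$

Hence the probability of finding particle 1 to be spin-up and particle 2 to be
spin-down is

$$\text{probability (up, down)} = \tfrac12\cos^2(\theta/2). \tag{6.17}$$

Again, because of the symmetry of the singlet state, the probability of spin-up in
SG1 and spin-down in SG2 is the same as spin-up in SG2 and spin-down in SG1,
namely $\tfrac12\cos^2(\theta/2)$.


Exercise 6.4 Verify that this last result is consistent with the basic property of
the singlet state that the particles are certain to be found with opposite spin
components along any given direction. ■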
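
The four joint probabilities for the singlet state can also be obtained by brute force from the amplitudes, which is a useful check on the bra-ket bookkeeping. This is a minimal sketch under stated assumptions: SG1 lies along z, SG2 is at angle $\theta$ in the xz-plane, all amplitudes are real (so no complex conjugation is needed), and the names `SINGLET`, `joint_probabilities` and `same_minus_different` are illustrative only.

```python
import math

S = 1 / math.sqrt(2)
# Singlet amplitudes in the product basis; index 0 = up, 1 = down.
# (0, 1) means particle 1 up, particle 2 down.
SINGLET = {(0, 1): S, (1, 0): -S}

def joint_probabilities(theta):
    """Joint outcome probabilities for SG1 along z and SG2 at angle theta."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    sg1 = {"up": (1.0, 0.0), "down": (0.0, 1.0)}   # <up|, <down|
    sg2 = {"up": (c, s), "down": (-s, c)}          # <up_n|, <down_n|
    probs = {}
    for o1, bra1 in sg1.items():
        for o2, bra2 in sg2.items():
            # First bra entry pairs with first ket entry, second with second.
            amp = sum(bra1[i] * bra2[j] * a for (i, j), a in SINGLET.items())
            probs[(o1, o2)] = amp ** 2
    return probs

def same_minus_different(theta):
    """P(up,up) + P(down,down) - P(up,down) - P(down,up)."""
    p = joint_probabilities(theta)
    return (p[("up", "up")] + p[("down", "down")]
            - p[("up", "down")] - p[("down", "up")])

probs = joint_probabilities(math.radians(60))
print(probs)
```

With $\theta = 0$ the 'both up' and 'both down' probabilities vanish, which is the perfect anticorrelation referred to in Exercise 6.4.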


6.3.4 Quantifying the correlations 


There is a specific quantity that characterizes the correlations between
measurements in SG1 and SG2 that turns out to be the key for testing for local
hidden variables. It is the correlation function $C(\theta)$, defined as follows for the
arrangement with SG1 aligned along the z-axis and SG2 at angle $\theta$:


$C(\theta)$ = + the probability of measuring both particles up
       + the probability of measuring both particles down
       − the probability of measuring particle 1 up and particle 2 down
       − the probability of measuring particle 2 up and particle 1 down.

In brief, $C(\theta)$ is ‘the probability of getting them the same’ minus ‘the probability
of getting them different’. A positive value of $C(\theta)$ indicates a tendency for the
spin measurements to be the same; a negative value indicates a tendency for the
spin measurements to be opposite.


The first two terms in the correlation function are each $\tfrac12\sin^2(\theta/2)$, from
Equation 6.16, and the second two terms are each $\tfrac12\cos^2(\theta/2)$, from
Equation 6.17. Hence, using these results,

$$C(\theta) = \sin^2(\theta/2) - \cos^2(\theta/2) = -\cos\theta. \tag{6.18}$$

We need one more step. To simplify the calculations, we took SG1 to be aligned
along the z-axis and SG2 to be at an angle $\theta$ to it. But the singlet state is
independent of basis, and we don’t ‘know’ which is the z-direction. All that really
matters for the measured probability is the angle between SG1 and SG2. So let us
say that they are respectively at $\theta_1$ and $\theta_2$ to the z-axis (keeping $\phi_1 = \phi_2 = 0$).
Then we can finally write

$$C(\theta_1 - \theta_2) = -\cos(\theta_1 - \theta_2). \tag{6.19}$$
Worked Example 6.2

Essential skill: Interpreting probabilities

Measurements on 10000 pairs of spin-½ particles in singlet spin states are
made by a pair of Stern–Gerlach analyzers oriented at 30° and 90° to the
vertical. What is the most likely number of pairs where both particles are
found to be spin-up? What is the most likely number of pairs where one
particle is found to be spin-up and the other spin-down? Why do we ask for
‘the most likely number’ rather than ‘the number’?


Solution

In this case the angle between the analyzers is $|\theta_1 - \theta_2| = 60°$. The probability
of both particles being measured to be spin-up is $\tfrac12\sin^2(30°) = 1/8$, so the most
likely number of such pairs is 10000/8 = 1250.

There are two ways for one particle to be found to be spin-up and
the other spin-down, so the most likely number of such pairs is
$2 \times \tfrac12\cos^2(30°) \times 10000 = 7500$.

Quantum-mechanical predictions are statistical, and just as the measured
value of an observable only closely approaches the expectation value as the
number of measurements becomes large, so the actual number of times an
outcome occurs approaches the predicted number only as the total number
of measurements becomes large. Hence we ask for ‘the most likely number’.
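
To put a number on 'statistical' in this worked example, one can treat each pair as an independent trial with a fixed outcome probability, so that the count of a given outcome follows a binomial distribution. That modelling step is an assumption of this sketch rather than something stated in the text, and the function name is illustrative.

```python
import math

def expected_count_and_spread(n_pairs, p):
    """Mean and standard deviation of a binomial count with success probability p."""
    mean = n_pairs * p
    spread = math.sqrt(n_pairs * p * (1 - p))
    return mean, spread

n_pairs = 10000
theta = math.radians(60)                      # angle between the two analyzers

p_both_up = 0.5 * math.sin(theta / 2) ** 2    # Equation 6.16
p_opposite = 2 * 0.5 * math.cos(theta / 2) ** 2   # two ways, Equation 6.17

both_up = expected_count_and_spread(n_pairs, p_both_up)
opposite = expected_count_and_spread(n_pairs, p_opposite)
print("both up:  mean %.0f, spread %.1f" % both_up)
print("opposite: mean %.0f, spread %.1f" % opposite)
```

On this model the counts 1250 and 7500 are the centres of distributions with spreads of roughly 33 and 43 pairs respectively, so an individual run of 10000 pairs would be expected to land near, but rarely exactly on, the quoted values.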


Exercise 6.5 Many singlet pairs of spin-½ particles are observed with SG1 and
SG2 oriented at 0° and 90° to the vertical, respectively. What are the predicted
relative proportions of up-up, up-down, down-up and down-down results? ■


Equation 6.19 is the quantum-mechanical prediction for the correlation function.
If many measurements are made with SG1 and SG2 at fixed angles $\theta_1$ and $\theta_2$, we
can define the corresponding experimental quantity $D(\theta_1 - \theta_2)$ as follows:


$D(\theta_1 - \theta_2)$ = + the proportion of measurements with both spin-up
       + the proportion of measurements with both spin-down
       − the proportion of measurements with particle 1 up and particle 2 down
       − the proportion of measurements with particle 2 up and particle 1 down.


If, in the limit of many measurements, $D(\theta_1 - \theta_2)$ approaches $C(\theta_1 - \theta_2)$ for all
$(\theta_1 - \theta_2)$, we would say that the quantum-mechanical prediction is confirmed.
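
The relationship between the measured proportion D and the predicted C can be illustrated with a toy Monte Carlo. This sketch simply draws 'same' or 'different' outcomes with the quantum-mechanical probabilities $\sin^2(\theta/2)$ and $\cos^2(\theta/2)$; it is not a model of the apparatus, and the function names and the fixed random seed are illustrative choices.

```python
import math
import random

def predicted_C(theta1, theta2):
    """Equation 6.19: quantum-mechanical correlation function."""
    return -math.cos(theta1 - theta2)

def sampled_D(theta1, theta2, n_pairs, seed=0):
    """Proportion 'same' minus proportion 'different' in n_pairs simulated runs."""
    p_same = math.sin((theta1 - theta2) / 2) ** 2   # up-up plus down-down
    rng = random.Random(seed)
    n_same = sum(1 for _ in range(n_pairs) if rng.random() < p_same)
    n_different = n_pairs - n_same
    return (n_same - n_different) / n_pairs

theta1, theta2 = 0.0, math.radians(60)
for n in (100, 10000, 1000000):
    print(n, sampled_D(theta1, theta2, n), predicted_C(theta1, theta2))
```

The sampled value wanders around the prediction for small numbers of pairs and settles towards it as the number grows, which is the sense in which D 'approaches' C.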


6.3.5 Bell’s inequalities 


Stepping outside the standard formalism of quantum mechanics, John Bell asked a
very pertinent question: if hidden variables did exist, would that impose any
restrictions on the correlation function, $C(\theta_2 - \theta_1)$?


In 1964, Bell established an inequality that must be satisfied for any hidden 
variable theory within a broad class known as local hidden-variable theories. 
Roughly speaking, these are hidden-variable theories that do not include any 
non-local effects. More precisely, the assumptions made by Bell in establishing 
his inequality were realism and locality. In this context, 


Realism implies that observables have values independently of any 
measurement. 


Locality implies that events at any location cannot influence what happens at 
another location before a light signal could travel between the two locations. 


The main attraction of hidden variable theories would be that they rescue realism, 
but most proponents of hidden variables would not want this to be achieved at the 
expense of sacrificing locality. Taken together, the assumptions made by Bell are 
commonly referred to as local-realism. 


In the context of Bohm’s hypothetical experiment, Bell showed that, for any local
hidden-variable theory (a theory embodying local-realism), logic imposes a
constraint on the possible values of the correlation function, $C(\theta_2 - \theta_1)$. Note that
this is not a quantum-mechanical result. It is the result of an entirely classical
analysis, telling us what local hidden-variable theories are capable of explaining.
In its original form, the inequality derived by Bell was not well-adapted to 
comparing theory with experiment. However, there are various similar 
inequalities, all of which we shall call Bell’s inequalities. In particular, inspired 
by Bell’s work, Clauser, Horne, Shimony and Holt in 1969 used the same 
assumption of local-realism to prove a form of Bell’s inequality called the CHSH 
inequality; this is very suitable for comparing theory with experiment. 


The CHSH inequality. Consider a series of Bohm-type measurements made
with SG1 aligned at two angles $\theta_1$ and $\theta_1'$ and SG2 aligned at two angles $\theta_2$
and $\theta_2'$. That makes four possible kinds of correlation function, which can be
combined to form the sum

$$\Sigma = C(\theta_1 - \theta_2) + C(\theta_1 - \theta_2') + C(\theta_1' - \theta_2) - C(\theta_1' - \theta_2'). \tag{6.20}$$

CHSH proved, on the basis of assumptions equivalent to local-realism, that

$$|\Sigma| \le 2. \tag{6.21}$$

This form of Bell’s inequality is known as the CHSH inequality. Because the
arguments are quite lengthy and because no quantum mechanics is involved, we
shall not prove the CHSH inequality here, but we shall make use of its existence.

By Bohm-type measurements, we simply refer to measurements like those we
have been discussing, involving the set-up shown in Figure 6.4.

A proof of the CHSH inequality can be found on the course website.
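
Although the proof is not given here, the simplest case can be checked by enumeration. The sketch below is an illustration only, not the CHSH proof: it assumes a deterministic hidden variable that fixes in advance the result (+1 for up, −1 for down) at each of the four analyzer settings, and scores a pair as +1 when the two results are the same and −1 when they differ, matching the definition of C.

```python
from itertools import product

def chsh_combination(a, a_prime, b, b_prime):
    """Same sign pattern as Equation 6.20, for one set of predetermined results.

    a, a_prime: results fixed for SG1 at its two settings (+1 or -1)
    b, b_prime: results fixed for SG2 at its two settings (+1 or -1)
    """
    return a * b + a * b_prime + a_prime * b - a_prime * b_prime

values = [chsh_combination(*results) for results in product((+1, -1), repeat=4)]
print(sorted(set(values)))
print(max(abs(v) for v in values))
```

Every one of the 16 assignments gives a combination of exactly +2 or −2, so any average over a distribution of such hidden variables must lie between −2 and +2. The general theorem also covers theories with probabilistic outcomes, which this enumeration does not attempt.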


The comparison is between measured D and predicted C: if the measured
values of $D(\theta)$ allow us to conclude that $|\Sigma| > 2$ for particular angles, then the
assumptions upon which CHSH based their derivation cannot hold. In that case,
either realism or locality would have to be rejected. In effect, either hidden
variables would not exist, or if they did exist, they would be non-local. So values
of $|\Sigma|$ greater than 2 would rule out local hidden-variable theories.


Now, the key question is: does quantum mechanics violate the CHSH inequality?
That is, does quantum mechanics predict that $|\Sigma| > 2$ for some possible choice of
angles? You can answer this in the following exercise.


Exercise 6.6 What would quantum theory predict for $\Sigma$ for the following
angles: $\theta_1 = 0°$, $\theta_1' = 90°$, $\theta_2 = 45°$, $\theta_2' = -45°$? ■


So quantum theory predicts that for some combinations of angles, $|\Sigma| > 2$, as
exemplified for the angles in the exercise above, for which $|\Sigma| = 2\sqrt2$. It follows,
according to CHSH, that the results of quantum-mechanical measurements cannot
be explained by local hidden-variable theories. This result is known as Bell’s theorem.


Bell’s theorem: No physical theory involving local hidden variables can 
reproduce all the predictions of quantum mechanics. 
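
The quantum-mechanical value of the CHSH sum is quick to evaluate from Equations 6.19 and 6.20. A minimal sketch, with illustrative function names and angles given in degrees:

```python
import math

def C(theta1_deg, theta2_deg):
    """Equation 6.19 for spin-1/2 singlet pairs, angles in degrees."""
    return -math.cos(math.radians(theta1_deg - theta2_deg))

def chsh_sum(theta1, theta1_prime, theta2, theta2_prime):
    """Equation 6.20."""
    return (C(theta1, theta2) + C(theta1, theta2_prime)
            + C(theta1_prime, theta2) - C(theta1_prime, theta2_prime))

sigma_qm = chsh_sum(0, 90, 45, -45)
print(sigma_qm, abs(sigma_qm) > 2)
```

For these angles each of the four terms contributes $-1/\sqrt2$ to the sum, giving $\Sigma = -2\sqrt2$ and hence $|\Sigma| = 2\sqrt2$, outside the range allowed by Equation 6.21.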


In 1975, the physicist Henry Stapp was moved to call this ‘the most profound 
discovery in the history of science’. The full implications of Bell’s theorem are 
discussed by physicists and philosophers in hundreds of papers every year and 
we cannot do more than state the overwhelming consensus that experiments 
confirming the quantum-mechanical predictions for the spin correlations, and 
thereby exhibiting breaches of the CHSH inequality, preclude the possibility of 
local hidden variables. There is a deep non-locality in Nature. An electron 
really does not have a value of its spin component on an arbitrary axis until it is 
measured by a Stern–Gerlach or equivalent apparatus to have such a value.


6.4 Experiments testing for hidden variables with photons


Despite the great success of quantum mechanics, it is just possible that quantum
mechanics is wrong in just those places where its predictions disagree with Bell’s
inequalities. To rule out this possibility we must carry out real experiments
and see whether the results agree with quantum mechanics or with local
hidden-variable theories. We refer to such experiments, that necessarily involve
entangled states of particles, as Bell’s inequality (or EPR) experiments.


Most actual Bell’s inequality experiments have been carried out with polarized
photons and not with atoms or electrons. The most famous were those of Aspect
and colleagues in Paris, conducted between 1978 and 1982. In order to relate such
photon experiments to what we have just said about spin-½ particles, this section
begins with an account of the key properties of entangled states of polarized
photons, starting with the quantum theory of photon polarization.


6.4.1 The quantum mechanics of polarized photons 


Polarization of light according to classical electromagnetic theory 


Light is a transverse electromagnetic wave, meaning that the electric and magnetic 
fields oscillate at right angles to the direction of propagation. This means that 
light can be linearly polarized, in which case the electric field is restricted to 
oscillating along some fixed direction (Figure 6.5a). In unpolarized light, the 
electric field oscillates in random directions perpendicular to the direction of 
propagation. Unpolarized light, such as light from an old-fashioned incandescent 
bulb, can be polarized by passing it through a sheet of Polaroid. 


Figure 6.5 (a) The relationship between the direction of propagation along
the y-direction, and the oscillating electric and magnetic fields for electromagnetic
radiation polarized in the z-direction. (b) Unpolarized light propagating in the
y-direction has the direction of the electric field fluctuating randomly in the
xz-plane. After passing through a piece of polarizing material such as Polaroid,
only the light with the electric field oscillating along the direction of the polarizer
axis (shown by the white two-headed arrow) is transmitted. (c) If the light is now
passed through a further polarizer rotated through angle $\theta$, only the component
having an electric field in this rotated direction is transmitted.


In Figure 6.5b, unpolarized light passes through a piece of Polaroid whose
polarizer axis is marked with a white two-headed arrow pointing in the
±z-direction. It is along this direction that the electric field oscillates after passing
through the polarizer. If the light is then passed through a second polarizer, as in
Figure 6.5c, made in the same way but rotated about the y-direction so that
the polarizer axis now makes an angle $\theta$ with the z-axis, then the intensity of
the light that is transmitted varies as $\cos^2\theta$; this is known as Malus’s law. If
the two polarizers are orthogonal (‘crossed’), so that $\theta = \pi/2$, then the
transmitted intensity is zero. Note that $\theta = \pi$ and $\theta = 0$ both correspond to
complete transmission: there is nothing to distinguish an electric field oscillating
‘up-and-down’ from one oscillating ‘down-and-up’, hence the two-headed
arrows on the Polaroids in Figure 6.5. This might seem a trivial point, but it is
very different from what we saw for spin-½ particles, where the equivalent of
Malus’s law was the $\cos^2(\theta/2)$ transmission by a pair of Stern–Gerlach analyzers,
and where spin-up was clearly distinguished from spin-down.

Étienne-Louis Malus, 1775–1812.


Polarization of light according to quantum theory 


Although a photon is a spin-1 particle, it does not have the expected three possible
values of spin component along a specific axis. For reasons to do with their
masslessness and relativity, photons have only two possible components of spin:
$+\hbar$ and $-\hbar$ along the direction of motion of the photon. This means that their
quantum-mechanical description is rather like that for electrons, but with a
difference.

Classically, we can also say that, since light is a transverse wave, it cannot be
polarized in the direction of propagation; there are just two independent
directions of polarization perpendicular to the direction of propagation.


Having in mind propagation along the y-direction, let us define two basis states of
linear polarization: $|V\rangle$ and $|H\rangle$ (V for vertical, along the z-axis, and H for
horizontal, along the x-axis). A photon in state $|V\rangle$ is 100% certain of passing
through a Polaroid with its polarizer axis oriented in the z-direction, and certain
not to pass through one oriented in the x-direction. Likewise, a photon in state $|H\rangle$
passes through a Polaroid oriented in the x-direction, but not one oriented in the
z-direction. Any state of polarization can be described by the linear combination

$$|\text{general}\rangle = a|H\rangle + b|V\rangle, \tag{6.22}$$

where a and b are complex probability amplitudes, subject only to the
normalization condition $|a|^2 + |b|^2 = 1$. The states $|H\rangle$ and $|V\rangle$ form a complete
orthonormal basis for describing the polarization of photons, so that:


• any state of polarization can be expressed in the form of Equation 6.22
(completeness),
• $\langle V|H\rangle = \langle H|V\rangle = 0$ (orthogonality),
• $\langle V|V\rangle = \langle H|H\rangle = 1$ (normalization).

A photon in the state given by the superposition

$$|V_\theta\rangle = \cos\theta\,|V\rangle + \sin\theta\,|H\rangle \tag{6.23}$$

has the property that it is assured of passage through a Polaroid with its polarizer
axis at an angle $\theta$ to the z-axis and $(90° - \theta)$ to the x-axis. We say it represents a
photon that is vertically polarized at angle $\theta$ (Figure 6.6).

Figure 6.6 A sheet of Polaroid oriented at an angle of $\theta$ to the z-axis (the
two-headed arrow labelled $V_\theta$) passes light vertically polarized in a plane at an
angle of $\theta$ to the z-axis. Note that light that is vertically polarized at an angle of
$(\theta + \pi/2)$ is also horizontally polarized relative to the Polaroid at angle $\theta$.
This figure presumes light incident along the y-direction.


A photon that is vertically polarized at an angle of $\theta + \pi/2$ is horizontally
polarized with respect to angle $\theta$, and is represented by

$$\begin{aligned}
|H_\theta\rangle = |V_{\theta+\pi/2}\rangle &= \cos(\theta + \pi/2)\,|V\rangle + \sin(\theta + \pi/2)\,|H\rangle \\
&= -\sin\theta\,|V\rangle + \cos\theta\,|H\rangle.
\end{aligned} \tag{6.24}$$

● Is the state $|V_\theta\rangle$ normalized to unity?

○ Yes, since $\langle V|V\rangle = \langle H|H\rangle = 1$ and $\langle V|H\rangle = \langle H|V\rangle = 0$, we obtain

$$\langle V_\theta|V_\theta\rangle = \big(\cos\theta\,\langle V| + \sin\theta\,\langle H|\big)\big(\cos\theta\,|V\rangle + \sin\theta\,|H\rangle\big) = \cos^2\theta + \sin^2\theta = 1.$$


The probability amplitude for a photon in state $|\text{general}\rangle$ of Equation 6.22 to be
transmitted through a polarizer oriented in the x-direction is

$$\langle H|\text{general}\rangle = \langle H|\big(a|H\rangle + b|V\rangle\big) = a\langle H|H\rangle + b\langle H|V\rangle = a,$$

so the transmission probability is

$$\text{transmission probability} = |\langle H|\text{general}\rangle|^2 = |a|^2. \tag{6.25}$$

The corresponding probability for transmission through a vertically-oriented
Polaroid is $|\langle V|\text{general}\rangle|^2 = |b|^2$. In each case, the transmitted photon is, in
accord with Principle 6a of Section 5.6, in the state defined by the Polaroid.
That is, a photon initially in the state $|\text{general}\rangle$ has a probability $|a|^2$ of being
transmitted by a horizontally-oriented Polaroid, and if it is transmitted, its state
will have collapsed to state $|H\rangle$; it will be a horizontally-polarized photon.
Needless to say, the usual quantum caveats apply: e.g. a photon cannot be said to
be transmitted until it has been detected on the far side of the Polaroid.


Now we can derive Malus’s law: when a photon that has been prepared in the
state $|V_\theta\rangle$ (Equation 6.23) is subject to a polarization measurement with a
polarizer in the z-direction, the probability of a photon being transmitted (in the
state $|V\rangle$) is

$$\begin{aligned}
\text{transmission probability} &= |\langle V|V_\theta\rangle|^2 \\
&= \big|\langle V|\big(\cos\theta\,|V\rangle + \sin\theta\,|H\rangle\big)\big|^2 \\
&= \big|\cos\theta\,\langle V|V\rangle + \sin\theta\,\langle V|H\rangle\big|^2 \\
&= \cos^2\theta.
\end{aligned}$$

This verifies Malus’s law for incoming photons polarized at angle $\theta$ incident on a
vertical analyzer, but physically the transmission can depend only upon the angle
between the plane of polarization and the orientation of the Polaroid, so Malus’s
law holds in general.

Malus’s law for light is reminiscent of the $\cos^2(\theta/2)$-rule for spin-½
particles but involves $\theta$ rather than $\theta/2$.

The definitions of right- and left-handed circular polarization are not universal:
we use the convention adopted in optics; electrical engineers use the
opposite convention.

Photons cannot have zero angular momentum in the direction of propagation, for
reasons arising from their masslessness and special relativity.
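
The derivation of Malus's law, including the claim that only the relative angle matters, can be checked numerically from Equations 6.23 and 6.24. This is a minimal sketch with illustrative names; states are stored as pairs (H amplitude, V amplitude), which is a choice made here and not a convention of the text.

```python
import math

def v_state(theta):
    """|V_theta> = cos(theta)|V> + sin(theta)|H>, stored as (H, V) amplitudes."""
    return (math.sin(theta), math.cos(theta))

def h_state(theta):
    """|H_theta> = -sin(theta)|V> + cos(theta)|H>, stored as (H, V) amplitudes."""
    return (math.cos(theta), -math.sin(theta))

def transmission_probability(polarizer_state, photon_state):
    """|<polarizer|photon>|^2."""
    amplitude = sum(complex(p).conjugate() * s
                    for p, s in zip(polarizer_state, photon_state))
    return abs(amplitude) ** 2

alpha = math.radians(20)   # polarizer axis, measured from the z-axis
beta = math.radians(65)    # plane of polarization of the incoming photon

p_general = transmission_probability(v_state(alpha), v_state(beta))
p_crossed = transmission_probability(v_state(alpha), h_state(alpha))
print(p_general, math.cos(alpha - beta) ** 2, p_crossed)
```

The first two printed numbers agree, showing the $\cos^2$ dependence on the relative angle, and the third is zero for crossed orientations.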


6.4.2 Circular polarization of light 


The fact that photons obey a $\cos^2\theta$ rule, rather than the $\cos^2(\theta/2)$ rule for spin-½
particles, has its origin in the fact that photons are massless spin-1 particles. As it
turns out, there are two states that do have well-defined values, $\pm\hbar$, of spin
angular momentum in the direction of propagation of the photons. To define these
we have to consider states with complex probability amplitudes a and b. Real
amplitudes $a = \cos\theta$ and $b = \sin\theta$ allow us to describe any possible ‘linear
polarization’ of a photon. But we know that probability amplitudes are complex in
general, so what states might

$$\frac{1}{\sqrt2}\big(|H\rangle \pm i|V\rangle\big) \tag{6.26}$$

represent?


We use the term linear polarization to describe the state produced by a simple
sheet of Polaroid, but there are indeed other forms of polarization, one of which is
circular polarization. Classically, circular polarization means that the electric
field vector E is actually rotating about the direction of propagation, the y-axis in
Figure 6.5. Such polarization is produced by passing plane-polarized light
through a transparent plate that introduces an appropriate phase difference
between the vertically- and horizontally-polarized components. That phase is
exactly what the $\pm i = e^{\pm i\pi/2}$ term provides in the quantum states presented in
Equation 6.26. We define states of ‘right-handed circular polarization’ $|R\rangle$ and
‘left-handed circular polarization’ $|L\rangle$ as follows:

$$|R\rangle = \frac{1}{\sqrt2}\big(|H\rangle + i|V\rangle\big), \qquad |L\rangle = \frac{1}{\sqrt2}\big(|H\rangle - i|V\rangle\big). \tag{6.27}$$

Although photons are spin-1 particles, their spin component has only two (instead
of three) possible values. The angular momentum carried by a right-hand
circularly-polarized photon, $|R\rangle$, is $-\hbar$ along the direction of propagation, and the
angular momentum carried by a left-hand circularly-polarized photon, $|L\rangle$, is $+\hbar$
along the direction of propagation, here the y-axis.


Exercise 6.7 Verify that $|R\rangle$ and $|L\rangle$ are orthonormal. ■


If a circularly-polarized photon is absorbed or emitted by an atom, the internal
angular momentum of the atom changes by $\pm\hbar$ in the direction in which the
photon is moving, and this is often very useful in experiments.
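
Complex amplitudes are easy to mishandle when taking bras, so a numerical check of the circular-polarization states of Equation 6.27 may be helpful. The sketch below uses illustrative names and stores states as (H amplitude, V amplitude) pairs. It also evaluates the probability that a circularly-polarized photon passes a linear polarizer at angle $\theta$; that quantity is not discussed in the text and is included here only as an extra illustration.

```python
import math

S = 1 / math.sqrt(2)
R = (S, 1j * S)      # |R> = (|H> + i|V>)/sqrt(2), stored as (H, V) amplitudes
L = (S, -1j * S)     # |L> = (|H> - i|V>)/sqrt(2)

def inner(bra_state, ket_state):
    """<bra|ket>: the bra coefficients are the complex conjugates of the ket's."""
    return sum(complex(b).conjugate() * k for b, k in zip(bra_state, ket_state))

def v_state(theta):
    """|V_theta> of Equation 6.23 as (H, V) amplitudes."""
    return (math.sin(theta), math.cos(theta))

print(inner(R, R), inner(L, L), inner(R, L))

# Probability for a photon in |R> to pass a linear polarizer at several angles.
for degrees in (0, 30, 45, 90):
    p = abs(inner(v_state(math.radians(degrees)), R)) ** 2
    print(degrees, p)
```

The transmission probability comes out as one half at every polarizer angle, reflecting the fact that a circularly-polarized state has no preferred transverse direction.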




6.4.3 The Aspect experiments 


In the now celebrated 1980s experiments of Alain Aspect and his colleagues, two 
entangled photons were produced from the decay of a calcium atom in an excited 
state having zero angular momentum. This state of calcium is forbidden by 
selection rules from decaying directly to the ground state which also has zero 
angular momentum, and decays first to an excited state which promptly decays to 
the ground state. The net angular momentum carried off when the atom jumps in 
two steps from one state with zero angular momentum to another must be zero. 
The two photons that are emitted in rapid succession in this process are entangled 
(Figure 6.7). 


A pair of photons travelling in opposite directions along the direction of the y-axis
would carry zero angular momentum if one had angular momentum $+\hbar$ in the
y-direction and the other $-\hbar$ in the y-direction. Because they are going in
opposite directions, this means they would have the same circular polarization.


It turns out that the entangled state of two photons, when written in terms of $|V\rangle$
and $|H\rangle$ for each photon, is

$$|\text{photon pair}\rangle = \frac{1}{\sqrt2}\big(|VV\rangle + |HH\rangle\big). \tag{6.28}$$

This last equation uses the positional notation whereby the first V or H refers to
the photon that appears in detector 1 and the second refers to the photon that
appears in detector 2.


Equation 6.28 shows that if one photon is found to have vertical polarization
along the z-direction, then the other will also be found to be vertically polarized.
If one rotates the angle along which vertical polarization is measured through
some angle $\theta$, defining the new states represented by $|V_\theta\rangle$ and $|H_\theta\rangle$, then the
above equation can further be written

$$|\text{photon pair}\rangle = \frac{1}{\sqrt2}\big(|V_\theta V_\theta\rangle + |H_\theta H_\theta\rangle\big). \tag{6.29}$$

This is analogous to the fact that a singlet state for spin-½ particles looks similar
in all bases.


More recently, a more efficient way of producing entangled pairs of photons has
superseded the method used by Aspect; it is described in the DVD video on
quantum information.

Figure 6.7 The atom jumps in two steps from the second excited state with
angular momentum quantum number J = 0, via the first excited state with J = 1,
to the ground state with J = 0. (J is a quantum number for total angular
momentum as you will see in Book 3.) An atom cannot jump between two J = 0
states by emitting a single photon, since the photon has angular momentum, but it
can do so by emitting two photons with opposite angular momenta.


Exercise 6.8 Using Equations 6.23 and 6.24, show that

$$\frac{1}{\sqrt2}\big(|V_\theta V_\theta\rangle + |H_\theta H_\theta\rangle\big) = \frac{1}{\sqrt2}\big(|VV\rangle + |HH\rangle\big).$$ ■


This is now rather like the case of two spin-½ particles entangled in a singlet state.
In this case, if one photon is found to be vertically polarized in any direction, the
other is certain to be found to be vertically polarized in the same direction. As in
the case of spin-½ particles, the crucial angles for confronting hidden-variable
theories are not the angles where these perfect correlations occur. Indeed, the
CHSH inequality applies, but with a slight twist. The clue comes from comparing
Malus’s law in Section 6.4.1 (involving $\cos^2\theta$) with the comparable expression
for spin-½ particles from Chapter 3, and also Equation 6.13 (involving $\cos^2(\theta/2)$).
It turns out that the equations involving spin-1 photons are obtained from the
corresponding spin-½ equations by replacing $\theta/2$ by $\theta$. Thus the key quantity is
again the correlation function


$C_{\text{photon}}(\theta)$ = + the probability of measuring both vertical
       + the probability of measuring both horizontal
       − the probability of measuring particle 1 vertical
         and particle 2 horizontal
       − the probability of measuring particle 2 vertical
         and particle 1 horizontal.


The quantum-mechanical prediction for this correlation function is

$$C_{\text{photon}}(\theta) = \cos(2\theta), \tag{6.30}$$


and this must be compared with the experimental quantity, $D_{\text{photon}}(\theta)$, which is
similar to $C_{\text{photon}}(\theta)$ but with probabilities replaced by proportions of
measurements.


The CHSH inequality $|\Sigma| \le 2$ (Equation 6.21) applies to local hidden-variable
theories of photons exactly as for spin-½ particles; $\Sigma$ is again defined by
Equation 6.20, but now with $C_{\text{photon}}(\theta)$.
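
The photon correlation function can be built directly from the entangled state of Equation 6.28 and the rotated basis states of Equations 6.23 and 6.24, and then fed into the CHSH sum. The sketch below uses illustrative names and stores single-photon states as (H amplitude, V amplitude) pairs. The four polarizer angles are an assumption of this example: they are the spin-½ angles of Exercise 6.6 halved, in line with the replacement of $\theta/2$ by $\theta$.

```python
import math

S = 1 / math.sqrt(2)
# (|VV> + |HH>)/sqrt(2); index 0 = H, 1 = V for each photon.
PAIR = {(0, 0): S, (1, 1): S}

def rotated_basis(angle):
    """Bras for a polarizer at this angle (real amplitudes, (H, V) order)."""
    c, s = math.cos(angle), math.sin(angle)
    return {"V": (s, c), "H": (c, -s)}     # Equations 6.23 and 6.24

def c_photon(angle1, angle2):
    """P(same) - P(different) for polarizers at angle1 and angle2 (radians)."""
    total = 0.0
    for o1, bra1 in rotated_basis(angle1).items():
        for o2, bra2 in rotated_basis(angle2).items():
            amp = sum(bra1[i] * bra2[j] * a for (i, j), a in PAIR.items())
            total += (1 if o1 == o2 else -1) * amp ** 2
    return total

def chsh_sum(a, a_prime, b, b_prime):
    """Equation 6.20 with the photon correlation function."""
    return (c_photon(a, b) + c_photon(a, b_prime)
            + c_photon(a_prime, b) - c_photon(a_prime, b_prime))

angles = [math.radians(x) for x in (0, 45, 22.5, -22.5)]
sigma_photon = chsh_sum(*angles)
print(c_photon(0.0, math.radians(30)), math.cos(math.radians(60)))
print(sigma_photon)
```

The computed correlation depends only on the difference of the two angles and matches $\cos(2\theta)$, and the chosen settings give $\Sigma = 2\sqrt2$.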


Aspect and his collaborators measured $D_{\text{photon}}(\theta)$ not just for a few angles,
but over a wide range of angles, comparing it with the quantum-mechanical
$C_{\text{photon}}(\theta)$, getting a perfect fit. From these results they could show that $\Sigma$
reached $2\sqrt2$, far beyond the CHSH limit of 2, verifying the profound non-locality
of Nature.

You may wish to view the accompanying DVD which contains a film of the Aspect
experiments, including interviews with John Bell and Alain Aspect. This archive film
quotes yet another form of Bell’s inequality:

$$|3C(\theta) - C(3\theta)| \le 2.$$


In connection with the next exercise, recall that a photon moving along the y-axis
in state $|V\rangle$ is polarized in a plane including the z-axis, and a photon in state
$|H\rangle$ is polarized in a plane including the x-axis. A photon in the state $|V_\theta\rangle$ of
Equation 6.23 is polarized in a plane at angle $\theta$ to the z-axis. From Equation 6.23,
such a photon with $\theta = 90°$ is in the state $|H\rangle$. This is natural: a photon ‘vertically
polarized relative to the x-direction’ is ‘horizontally polarized relative to the
z-direction’.


Exercise 6.9 Many pairs of photons are prepared in the entangled state of 
Equation 6.28. They travel in opposite directions along the y-axis, and fall on 
sheets of Polaroid, one close to the source of photons and one further away (in 
opposite directions from the source). Discuss, in terms of entanglement and the 
collapse of the state vector, what is observed behind the ‘far’ Polaroid in the 
following cases. 


(a) The near Polaroid passes photons vertically polarized in the z-direction, and 
the far Polaroid is oriented to pass photons polarized in the x-direction. 


(b) The near Polaroid passes photons vertically polarized in the x-direction, and 
the far Polaroid is oriented to pass photons polarized in the x-direction. 


(c) The near Polaroid passes photons vertically polarized in the z-direction, and
the far Polaroid is oriented to pass photons polarized in a plane at 45° to the
x-direction. ■


The Aspect experiments did not study the polarization of the photons using sheets
of Polaroid, which either pass the photon or not, just as a Stern–Gerlach preparer
with one exit blocked either passes a spin-½ particle or not. Instead, devices
known as polarizing beam splitters were used, which send a photon in one of
two directions depending on whether the photon is vertically or horizontally
polarized. This is analogous to a Stern–Gerlach analyzer that deflects spin-up and
spin-down atoms in two directions. Polarizing beam splitters will be described in
the next chapter.


6.4.4 Experiments with sets of three entangled particles 


Experiments like those of Aspect, that verify non-locality and provide evidence
against local hidden variables, all involve statistical measurements. The
polarizations of many pairs of particles are measured, and, from the correlations
between the photons that are measured to be horizontally and vertically polarized,
the quantities $D(\theta_1 - \theta_2)$, $D(\theta_1 - \theta_2')$, $D(\theta_1' - \theta_2)$ and $D(\theta_1' - \theta_2')$ are determined.
Knowing these, it is possible to test whether there are sets of angles for which the
CHSH inequality $|\Sigma| \le 2$ (Equation 6.21) is violated. As we know, quantum
theory says there are such angles. Since only a small fraction of the emitted
photons can actually be detected in such experiments, some people have suggested that
(however strange this might seem) the apparatus is somehow selecting pairs of
photons which weight the statistics to make the measured $|\Sigma|$ exceed 2.


It is the existence of such alleged loopholes that gives a special value to tests of
hidden variables that do not depend on statistics, and for which, in principle, a
single measurement would suffice to exclude hidden variables. Such a test was
proposed by Greenberger, Horne and Zeilinger in 1989. In its original form, it
was based on special entangled states of three spin-½ particles. These GHZ
states, as they are referred to, take the form, for spin-½ particles,

$$|\text{GHZ}\rangle = \frac{1}{\sqrt2}\big(|\uparrow\uparrow\uparrow\rangle - |\downarrow\downarrow\downarrow\rangle\big). \tag{6.31}$$

Note that we have now extended positional notation to states of three particles.

This is a superposition of two states: the first has all three particles spin-up with
respect to the z-axis, i.e. with $m_s = +\tfrac12$, and the second has all three particles
spin-down, with $m_s = -\tfrac12$. The particles travel in the y-direction and the x-axis
is then defined as the third axis of a right-handed coordinate system.


Exercise 6.10 Assume that the z-component of the spin of one particle in the 
three-particle entangled state of Equation 6.31 has been measured and found to be 
positive. What would the z-components of the spins of the other two particles turn 
out to be if they were measured? What would be the result if the spin component 
of the first particle turned out to be negative? Explain in terms of the collapse of 
state vectors.


The test conceived by GHZ is not obvious, and involves measuring the spin of the
three particles in the x-direction rather than the z-direction. The test involves three
spin-½ particles in the GHZ state, Equation 6.31. Consider the case that the spin in
the x-direction is measured for all three particles. GHZ showed that if there were
hidden variables, and no non-local effects, then there would always be an odd
number of particles with spin-up in the x-direction. In other words, only combinations
of $S_x$ measurements such as $+\frac{1}{2}\hbar, -\frac{1}{2}\hbar, -\frac{1}{2}\hbar$ and
$+\frac{1}{2}\hbar, +\frac{1}{2}\hbar, +\frac{1}{2}\hbar$, in which the product of the spins is positive,
would occur. But quantum mechanics implies that an odd number of particles with
spin-down in the x-direction must always be measured, e.g.
$-\frac{1}{2}\hbar, -\frac{1}{2}\hbar, -\frac{1}{2}\hbar$. A single experimental measurement in
which the product of the spins in the x-direction is negative would be evidence of
non-locality and evidence against local hidden variables.

You may wish to look at the article by N. D. Mermin in the American Journal of
Physics (1990) vol. 58, page 731, which gives the derivation of the results
summarized here.
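
The quantum-mechanical claim above can be checked numerically. The following is a minimal sketch, not a derivation: it assumes that the GHZ state carries a relative minus sign between its all-up and all-down terms, and that the $S_x$ eigenstates are $(|{\uparrow}\rangle \pm |{\downarrow}\rangle)/\sqrt{2}$. The function and variable names are illustrative only.

```python
from itertools import product
from math import sqrt

# Single-particle S_x eigenstates written in the S_z basis (up, down).
# Assumed convention: "+" is (|up> + |down>)/sqrt(2), "-" is (|up> - |down>)/sqrt(2).
X_STATES = {
    "+": (1 / sqrt(2), 1 / sqrt(2)),
    "-": (1 / sqrt(2), -1 / sqrt(2)),
}

def ghz_amplitude(z_bits):
    """Amplitude of the GHZ state on the S_z product state labelled by z_bits.

    z_bits is a tuple of three entries (0 = spin-up, 1 = spin-down).
    The relative minus sign is an assumption of this sketch.
    """
    if z_bits == (0, 0, 0):
        return 1 / sqrt(2)
    if z_bits == (1, 1, 1):
        return -1 / sqrt(2)
    return 0.0

def x_outcome_probability(outcome):
    """Probability of a given triple of S_x results, e.g. ('+', '-', '-')."""
    amplitude = 0.0
    for z_bits in product((0, 1), repeat=3):
        overlap = 1.0
        for sign, bit in zip(outcome, z_bits):
            # All coefficients are real, so no complex conjugation is needed.
            overlap *= X_STATES[sign][bit]
        amplitude += overlap * ghz_amplitude(z_bits)
    return amplitude ** 2

probabilities = {o: x_outcome_probability(o) for o in product("+-", repeat=3)}
for outcome, p in probabilities.items():
    print("".join(outcome), round(p, 3))
```

Under these assumptions only the four outcomes containing an odd number of spin-down results have non-zero probability (1/4 each), so the product of the three x-spins is always negative.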


Two different views can be taken concerning this result. 


1. GHZ have found a contradiction between quantum mechanics and local
hidden-variable theories. Since we believe quantum mechanics, and the question
of local hidden variables was raised only in connection with the completeness of
quantum mechanics, that is all there is to say: there cannot be local hidden
variables.


2. On the other hand, it might be possible that quantum mechanics is correct
everywhere except just those places where specific consequences of hidden
variables become evident. Hence it is worth verifying by experiment that
quantum mechanics always holds true, even in the cases where it contradicts
hidden-variable theories.


Einstein was always clear that he did not doubt the practical correctness of
quantum mechanics; it was its completeness that he questioned.


It is just the second point of view that inspired Aspect and others to carry out the
experiments discussed in Section 6.4.3. More recently, experiments were carried
out on GHZ states, and these verify the predictions of quantum mechanics. As
with the experiments with entangled pairs, these were also carried out with
photons rather than spin-½ particles; the GHZ state for three photons was

$$\frac{1}{\sqrt{2}}\bigl(|V,V,V\rangle + |H,H,H\rangle\bigr). \tag{6.32}$$

The three photons are in a superposition of two states: $|V,V,V\rangle$ in which all
three are vertically polarized, and $|H,H,H\rangle$ in which all three are horizontally
polarized.


In the experiment, the polarization of the three photons is measured along an axis
rotated by 45° to the original axis. We denote by H′ and V′ horizontal and vertical
polarization along this new axis. The alternatives for this system are as follows.


Hidden variables  If the polarizations have values prior to measurement
(i.e. there are local hidden variables), then the following combinations of
measured polarizations are possible: V′V′V′, H′H′V′, H′V′H′ and V′H′H′;
i.e. there is an odd number of vertically-polarized photons.


Quantum mechanics  There is an odd number of horizontally-polarized
photons, i.e. the permitted combinations are: H′H′H′, H′V′V′, V′H′V′ and
V′V′H′.
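
The quantum-mechanical list can be recovered by expanding the three-photon GHZ state in the rotated basis. The sketch below assumes the convention $|H'\rangle = (|H\rangle + |V\rangle)/\sqrt{2}$ and $|V'\rangle = (|H\rangle - |V\rangle)/\sqrt{2}$, which is not stated in the text; with the opposite sign convention the roles of H′ and V′ would be interchanged.

```latex
\begin{align*}
|H\rangle &= \tfrac{1}{\sqrt{2}}\bigl(|H'\rangle + |V'\rangle\bigr), \qquad
|V\rangle = \tfrac{1}{\sqrt{2}}\bigl(|H'\rangle - |V'\rangle\bigr), \\[4pt]
\tfrac{1}{\sqrt{2}}\bigl(|V,V,V\rangle + |H,H,H\rangle\bigr)
 &= \tfrac{1}{4}\Bigl[\bigl(|H'\rangle - |V'\rangle\bigr)^{\otimes 3}
                     + \bigl(|H'\rangle + |V'\rangle\bigr)^{\otimes 3}\Bigr] \\
 &= \tfrac{1}{2}\bigl(|H',H',H'\rangle + |H',V',V'\rangle
                     + |V',H',V'\rangle + |V',V',H'\rangle\bigr).
\end{align*}
```

Terms containing an odd number of $|V'\rangle$ factors enter the two cubes with opposite signs and cancel, so only the four combinations with an odd number of H′ photons survive, each with probability 1/4.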


In Figure 6.8 we show you what was found by Pan et al. in 2000. 


This was a difficult experiment, and the results seen in Figure 6.8(c) agree with
the quantum theory predictions in Figure 6.8(a) to within experimental error. The
small numbers of counts in the ‘wrong’ bins are consistent with predictions
that take into account the known experimental difficulties. The experimental
uncertainties are not related to underlying quantum probabilities. The detection
efficiencies were small, and so the ‘fair sampling’ assumption is still
invoked. With this in mind, these results are a clear vindication of quantum
theory, and strong evidence that Nature is non-local and that there are no local
hidden variables. If observables had values prior to measurement, then the
experimental results would have been like part (b) of the figure, not part (a).

You may wish to look at the paper by Jian-Wei Pan et al. in Nature (2000)
vol. 403, page 515.


6.5 The significance of entanglement


This chapter has introduced entanglement, and shown how
it makes very fundamental questions about the nature of reality
a matter for experimental study. The results of some experiments
can only be understood as the results of non-local effects, and
in terms of the fact that some quantities do not have values prior
to measurement. Einstein had expressed the views of a scientist
who thought very hard and clearly about the implications of
deeply held common-sense views about the world. The fact that
experiments performed years later clearly contradict these views
is a measure of just how startling quantum mechanics really is.
But these conceptual results are very far from exhausting the
great current interest in the consequences of entanglement.


Here is a quote that gives some measure of the interest:


‘Entanglement is a uniquely quantum-mechanical resource
that plays a key role in many of the most interesting
applications of quantum computation and quantum
information; entanglement is iron to the classical world’s
bronze age. In recent years there has been a tremendous
effort trying to better understand the properties of
entanglement considered a fundamental resource of Nature,
of comparable importance to energy, information, entropy,
or any other fundamental resource.’

Nielsen and Chuang, Quantum computation
and quantum information


Elsewhere, it has been suggested that studying the applications
of entanglement will lead to a new understanding of Nature,
just as the study of the efficiency of steam engines led Carnot
and others to fundamental laws of thermodynamics.


The next chapter will give an account of some of the
contemporary applications of entanglement that have aroused
intense interest around the world.


Figure 6.8 Histograms showing the fraction
of the measurements of the polarization of
three GHZ photons in each of eight possible
combinations: (a) the results predicted by
quantum mechanics; (b) the predictions of
hidden-variable theories; (c) the experimental
results.


Summary of Chapter 6


Section 6.1 Do quantum systems have properties before they are measured? 
This fundamental question, raised in this section, is an ongoing question for the 
rest of the chapter. 


Section 6.2 A two-particle system is in an entangled state if its state vector
cannot be expressed as a product of terms representing each particle, and so
neither particle on its own has definite properties, though the pair does. Entangled
states exhibit non-locality – Einstein’s ‘spooky action at a distance’ – a deeply
non-classical property. An example is provided by the singlet state of a pair of
spin-½ particles. A significant property of singlet states is that they have a similar
form in all bases; as a result, for whatever angle the spin component of one
particle is measured, the spin component of the second particle, measured at the
same angle, will be the opposite. Entanglement is not restricted to the spin states
of pairs of particles; it also applies to spatial states and to photons.


Section 6.3 Entanglement can be studied experimentally. Bohm’s hypothetical
experiment involves the singlet state of a pair of spin-½ particles. According to
quantum mechanics, a pair of particles in the singlet state exhibits correlations that
cannot be exhibited by any system governed by local hidden variables (Bell’s theorem).
In this sense, Nature is non-local. The question is whether Nature obeys the laws
of quantum mechanics or satisfies some limits – Bell’s inequalities – satisfied by
any system with local hidden variables.


Section 6.4 Experiments to test for hidden variables have been carried out
involving polarized light (photons). The quantum description of such polarization
is rather like the quantum description of spin-½ particles, with some differences
(mainly $\theta/2 \to \theta$). Two experiments are described: the classic experiments
of Aspect and colleagues carried out in the early 1980s, and the more recent
experiments involving entangled states of three photons. In each case, the
quantum-mechanical predictions are decisively reproduced: Nature is non-local
and there are no local hidden variables.


Section 6.5 Entanglement is of great current interest and may be regarded as a
fundamental resource of nature. Practical applications will be discussed in the
next chapter.


Achievements from Chapter 6


After studying this chapter, you should be able to: 


6.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


6.2 Briefly outline the issues relating to the possible existence of hidden 
variables. 


6.3 Explain what it means to say that the singlet state of two spin-½ particles is
entangled.


6.4 Give two further examples of entangled systems, apart from the singlet state
of two spin-½ particles.


6.5 Explain what it means to say that the singlet state of two spin-½ particles
takes a similar form in all bases, demonstrate this fact, and explain the
consequences for the measurement of the spins of two particles in such a
state.


6.6 Give an account of Bohm’s hypothetical experiment based on spin
measurements on the singlet state of two spin-½ particles; in particular, give
an account of the role of entanglement.


6.7 Apply the expression for the ket representing the spin state for a spin-½
particle that is spin-up in direction n to derive the correlation function $C(\theta)$.


6.8 Interpret the CHSH inequality, and explain its significance for Bohm’s 
experiment and the evidence for non-locality. 


6.9 Appreciate the key features of linear and circular polarization of light and 
the representation of the polarization states of photons. 


6.10 Give an account of the Aspect experiments, and the significance of what was 
found for our understanding of quantum mechanics. 


6.11 Give an account of the experiments involving the GHZ states of three
entangled particles.


Introduction: quantum entanglement put to work


The previous chapter was devoted to the nature of quantum entanglement, to some 
experiments that reveal the remarkable consequences of entanglement, and to a 
discussion of how the existence of entanglement influences our interpretation of 
quantum mechanics. In this chapter we introduce some applications of 
entanglement. The applications of entanglement that have attracted the attention 
of physicists around the world, and of large companies too, include quantum 
cryptography, quantum teleportation and quantum computing. Together these are 
referred to as quantum information, giving this chapter its title. 


Little in this chapter could have been included in a quantum mechanics course 
written two decades ago. In Chapter 6 we presented a figure that showed the 
continuing rise in the number of citations of the paper by Einstein, Podolsky and 
Rosen (EPR) that first brought the remarkable consequences of entanglement to 
light. This dramatic rise partly reflects interest in the deeper understanding of 
quantum mechanics; but it also reflects the intense worldwide commitment to 
research in a number of fields that seek to exploit entanglement for practical 
purposes. This research is not just carried out in universities; companies too are 
investing in entanglement research, as you will see in the DVD film that we 
recommend you view towards the end of this chapter. 


Section 7.1 reviews some of the key concepts from Chapter 6 concerning the
polarization of photons, adding some extensions and notation that will be used
throughout the chapter. Section 7.2 introduces quantum cryptography, a technique
that is already becoming increasingly important to banks, for the internet, and in
fact to all those for whom secure communications are important. The chapter
introduces two ‘protocols’ for quantum cryptography, one of which (BB84) does
not, and one which does, involve entanglement. Section 7.3 introduces quantum
teleportation. This was first achieved in the final years of the last century, and is
still of considerable interest worldwide. We explain why we are still very far from
being able to beam people from place to place, but what has been achieved is
impressive nevertheless. Much of the interest in quantum cryptography and
teleportation arises from the importance of the technologies that are needed
for quantum computing. Section 7.4 is a very brief introduction to quantum
computing, a topic that has aroused intense research activity throughout the world.
The amount of background information on computing that we would have had to
impart makes it impossible for us to give more than a brief glimpse of both the
promise and the difficulties to be faced.

In computing, a protocol is a standard or convention that enables data transfer to
take place.


7.1 Photon polarization revisited 


In this section we revisit the description of polarized photons that we met in
Section 6.4 of Chapter 6, and introduce a matrix representation that simplifies
some calculations. A simple method of preparing light in a given state of linear
polarization is to pass it through a Polaroid filter. For incoming classical
electromagnetic waves, a Polaroid filter transmits only the component of the
electromagnetic waves with the electric field along the polarizer axis.


The applications to be described in this chapter call for a polarization analyzer 
in the form of a polarizing beam splitter that separates the incoming 
electromagnetic field into two components: one with the electric field along, and 
one with the electric field perpendicular to, a fixed polarizer axis. Having passed 
through the polarizing beam splitter, light, polarized in the two perpendicular 
directions, leaves the polarization analyzer in two different directions. A 
Wollaston prism (Figure 7.1) is one example of a polarizing beam splitter. 
Polarizing beam splitters can be used either to prepare light with a defined 
polarization, or to analyze the polarization of incoming light. 


The polarizing beam splitter 
that you will see in the 
accompanying film ‘Quantum 
information’ on the DVD is not 
a Wollaston prism but one in 
which the horizontally- and 
vertically-polarized components 
emerge in perpendicular 
directions. 


Figure 7.1 A Wollaston prism separates incoming light into components of
orthogonal polarizations which propagate at different angles upon emerging
from the prism. One component V (shown in blue) is polarized in the direction
along the polarizer axis. The other component H (shown in green) is polarized
perpendicular to the polarizer axis. The prism consists of two ‘birefringent’
calcite crystals, cemented together on their base to form two right-angled
triangular prisms with perpendicular ‘anisotropy axes’. It separates light into two
polarization components just as a Stern–Gerlach apparatus separates spin-½
particles into spin-up and spin-down components.


For linearly-polarized light entering a polarizing beam splitter, the intensity of the
emerging light that is polarized parallel to the polarizer axis follows Malus’s law
as in Section 6.4.1 of Chapter 6, varying as $\cos^2\theta$, where $\theta$ is the angle between
the incoming light polarization direction and the polarizer axis. This is just
as it is with a Polaroid filter. But whereas a Polaroid filter absorbs a fraction
$\sin^2\theta = 1 - \cos^2\theta$ of the light, none of the total light intensity is lost in an
ideal polarizing beam splitter. The intensity of the light emerging polarized
perpendicular to the polarizer axis will vary as $\sin^2\theta$. In many ways, a polarizing
beam splitter acts very much like a Stern–Gerlach apparatus, but for photons.


We shall be concerned in this chapter with the polarization of individual photons.
In order to treat individual photons, we require an orthonormal basis to describe
the polarization state, and it will be convenient to use matrix notation. The state of
a photon that is vertically polarized in the z-direction will be denoted as in
Chapter 6 by $|V\rangle$, while that of a photon polarized orthogonal to the z-axis is
denoted by $|H\rangle$. Using matrix notation analogous to the spinor notation for spin-½
particles, we write

$$|V\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |H\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{7.1}$$

By convention, $\theta$ is always measured in the xz-plane down from the z-axis towards
the x-axis.

We use the symbol $\mathcal{P}$ for the linear polarization matrix to avoid confusion later
with the symbol $P$ which represents probability.


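For photons polarized vertically or horizontally relative to a polarizer axis at an
angle $\theta$ to the z-axis, the corresponding states and matrices are

$$
\begin{aligned}
|V_\theta\rangle &= \cos\theta\,|V\rangle + \sin\theta\,|H\rangle = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}, \\
|H_\theta\rangle &= -\sin\theta\,|V\rangle + \cos\theta\,|H\rangle = \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}.
\end{aligned}
\tag{7.2}
$$

When $\theta = 0^\circ$, we recover $|V_0\rangle = |V\rangle$ and $|H_0\rangle = |H\rangle$.


Exercise 7.1 Prove that $|V_\theta\rangle$ and $|H_\theta\rangle$ are orthogonal for any value of $\theta$.


Exercise 7.2 Show that $|V_\theta\rangle$ and $|H_\theta\rangle$ are normalized to unity.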


For a given polarizer axis, set at an angle $\theta$ to the z-axis, we define a linear
polarization variable $\mathcal{P}(\theta)$ by associating the value of +1 with vertical
polarization, and the value −1 with horizontal polarization. This observable
quantity corresponds to a linear Hermitian operator $\hat{\mathcal{P}}(\theta)$, which turns out to have
the matrix representation

$$\mathcal{P}(\theta) = \begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}. \tag{7.3}$$

We call this the linear polarization matrix: by definition, it is expected to have
eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$ with eigenvalues +1 and −1, respectively; the
following exercise verifies this.


Exercise 7.3 Show that the matrix $\mathcal{P}(\theta)$ and the eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$
satisfy the eigenvalue equations $\mathcal{P}(\theta)\,|V_\theta\rangle = |V_\theta\rangle$ and
$\mathcal{P}(\theta)\,|H_\theta\rangle = -|H_\theta\rangle$.
Hint: You'll need to use the following trigonometrical identities:

$$\cos(A \pm B) = \cos A \cos B \mp \sin A \sin B,$$
$$\sin(A \pm B) = \sin A \cos B \pm \cos A \sin B.$$
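
A numerical spot check of the eigenvalue equations can be a useful sanity test before attempting the algebra. The sketch below is not a proof and does not replace the exercise; the helper names and the sample angles are illustrative choices, with the matrix and column vectors taken from Equations 7.3 and 7.2.

```python
from math import cos, sin, radians, isclose

def polarization_matrix(theta):
    """2x2 linear polarization matrix of Equation 7.3 (theta in radians)."""
    return (
        (cos(2 * theta), sin(2 * theta)),
        (sin(2 * theta), -cos(2 * theta)),
    )

def v_state(theta):
    """Column vector for |V_theta> in the (|V>, |H>) basis, Equation 7.2."""
    return (cos(theta), sin(theta))

def h_state(theta):
    """Column vector for |H_theta> in the (|V>, |H>) basis, Equation 7.2."""
    return (-sin(theta), cos(theta))

def apply(matrix, vector):
    """Multiply a 2x2 matrix by a two-component column vector."""
    return tuple(row[0] * vector[0] + row[1] * vector[1] for row in matrix)

def is_eigenvector(matrix, vector, eigenvalue, tol=1e-12):
    """True if matrix * vector equals eigenvalue * vector within tol."""
    image = apply(matrix, vector)
    return all(isclose(i, eigenvalue * v, abs_tol=tol) for i, v in zip(image, vector))

# Spot-check a few arbitrarily chosen angles.
for degrees in (0.0, 17.0, 45.0, 90.0, 133.0):
    theta = radians(degrees)
    p = polarization_matrix(theta)
    assert is_eigenvector(p, v_state(theta), +1)
    assert is_eigenvector(p, h_state(theta), -1)
```

The loop passing silently for every sampled angle is consistent with the eigenvalues +1 and −1 quoted in the text.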


You have probably noticed that the matrix in Equation 7.3 and the eigenvectors
in Equations 7.2 are reminiscent of the equations for the description of
spin-½ particles in Chapter 3. In fact, replacing $\theta$ with $2\theta$ and $\hbar/2$ with 1 in
the expressions for the general spin matrix and its eigenvectors (with $\phi = 0$),
one obtains Equation 7.2 and Equation 7.3. This is no coincidence as these
equations describe very similar systems: in both cases we are describing
systems with two eigenvectors onto which any given state vector may collapse. It
should now be clear why we made the analogy with the Stern–Gerlach
apparatus when discussing the polarizing beam splitter. Although it will not
greatly affect our arguments, it is also worth noting that there are differences
between Stern–Gerlach analyzers and polarizing beam splitters. In contrast to a
Stern–Gerlach analyzer, a polarizing beam splitter does not cause photons with
different values of angular momentum to travel along different paths. For reasons
related to masslessness and relativity, the only angular momentum component that
makes sense for a photon is that along its direction of propagation, and this is
equal to $-\hbar$ for right-handed circular polarization and $+\hbar$ for left-handed circular
polarization. The different paths taken by vertically- and horizontally-polarized
light through a polarizing beam splitter are those of linearly polarized photons, so
are not directly related to different values of angular momentum.


We turn now to consider the measurement of the polarization of a single linearly
polarized photon. We assume that this photon is in a general state $|\psi\rangle$, and that its
polarization is measured with a polarizer whose axis is at an angle $\theta$ to the z-axis.
An example of such a situation is shown in Figure 7.2.


Figure 7.2 A Wollaston prism can be used as a polarization analyzer by placing
a photon detector at each of the out-directions. Each emerging photon is registered
by one or other of the detectors. In this example we take the z-axis to lie along the
polarization direction of the incoming linearly polarized photons (the solid
two-headed arrow, not vertical). $\theta$ is then the angle between the incoming photon
polarization direction and the polarizer axis. For each incoming photon state $|\psi\rangle$,
vertical polarization (relative to the polarizer axis) is detected with probability
$|\langle V_\theta|\psi\rangle|^2$, and horizontal polarization is detected with probability
$|\langle H_\theta|\psi\rangle|^2$.


Since the eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$ in Equation 7.2 form an orthonormal basis
for linear polarization, we can expand $|\psi\rangle$ as

$$|\psi\rangle = \alpha\,|V_\theta\rangle + \beta\,|H_\theta\rangle, \tag{7.4}$$

where $|\alpha|^2 + |\beta|^2 = 1$ to ensure that $|\psi\rangle$ is normalized. Suppose that we have two
detectors after the polarizer (Figure 7.2). Each photon will register in only one
of the two detectors. A photon detected by the $|V_\theta\rangle$ (blue) detector will be
polarized parallel to the polarizer axis, and a photon detected by the $|H_\theta\rangle$ (green)
detector will be polarized perpendicular to the polarizer axis. We will denote the
probability that the $|V_\theta\rangle$ detector registers by $P_+(\theta)$, and the probability that
the $|H_\theta\rangle$ detector registers by $P_-(\theta)$. We use here the subscripts ± to denote
the eigenvalues ±1 which correspond to the two eigenvectors of $\mathcal{P}(\theta)$ (see
Exercise 7.3). These probabilities can be calculated by taking the modulus
squared of the inner product of $|\psi\rangle$ with each of the eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$:

$$P_+(\theta) = |\langle V_\theta|\psi\rangle|^2 = |\alpha|^2, \tag{7.5a}$$
$$P_-(\theta) = |\langle H_\theta|\psi\rangle|^2 = |\beta|^2. \tag{7.5b}$$

The direction of propagation of an emerging photon determines with absolute
certainty which of the two eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$ describes the emerging
photon’s polarization. Consequently, a polarizing beam splitter can also be used to
prepare individual photons with a defined polarization state. Again, we mention
the analogy with a Stern–Gerlach apparatus preparing spin-½ particles in a
particular state of spin.

Recall that after a measurement, the state of a system (here a photon) is an
eigenvector of the appropriate operator, here $\hat{\mathcal{P}}(\theta)$.


Exercise 7.4 Photons linearly polarized along the z-axis are incident upon a
polarizing beam splitter oriented at angle $\theta$ to the z-axis. Show that the probability
of a photon emerging in the state $|V_\theta\rangle$ is equal to $\cos^2\theta$, consistent with Malus’s
law.
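
The probability rule of Equations 7.5 is easy to tabulate. The following minimal sketch evaluates $P_\pm(\theta)$ for a state supplied as a pair of amplitudes on $|V\rangle$ and $|H\rangle$; the function name and the sample angles are illustrative, and the numbers it prints are a check on the algebra of the exercise, not a substitute for it.

```python
from math import cos, sin, radians

def detection_probabilities(psi, theta):
    """Return (P_plus, P_minus) for a photon state measured at analyzer angle theta.

    psi   -- pair (amplitude on |V>, amplitude on |H>), assumed normalized
    theta -- analyzer angle in radians, measured from the z-axis
    """
    v_theta = (cos(theta), sin(theta))
    h_theta = (-sin(theta), cos(theta))
    # The basis vectors are real, so <V_theta|psi> needs no complex conjugation.
    amp_plus = v_theta[0] * psi[0] + v_theta[1] * psi[1]
    amp_minus = h_theta[0] * psi[0] + h_theta[1] * psi[1]
    return abs(amp_plus) ** 2, abs(amp_minus) ** 2

vertical = (1.0, 0.0)  # photon polarized along the z-axis, |V>
for degrees in (0, 30, 45, 60, 90):
    p_plus, p_minus = detection_probabilities(vertical, radians(degrees))
    print(f"theta = {degrees:2d} deg: P+ = {p_plus:.3f}, P- = {p_minus:.3f}")
```

For the vertically polarized input the first column follows $\cos^2\theta$ and the two probabilities always sum to 1, reflecting the fact that an ideal polarizing beam splitter loses no photons.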


When the analyzer axis lies along the z-axis ($\theta = 0^\circ$), the eigenvectors are just
$|V_0\rangle = |V\rangle$ and $|H_0\rangle = |H\rangle$. We shall refer to this basis (defined by $\theta = 0^\circ$) as the
H/V basis. For the purposes of our discussion of cryptography in the next
section, it is also useful to introduce a second basis, the eigenvectors of $\mathcal{P}(\theta)$ for
$\theta = 45^\circ$, corresponding to a polarizing beam splitter oriented at $\theta = 45^\circ$. These
eigenvectors are orthonormal (see Exercises 7.1 and 7.2) and are a complete set; we shall
refer to them as the diagonal basis. Substituting $\theta = 45^\circ$ into Equations 7.2
shows that the eigenvectors in the diagonal basis are

$$|V_{45}\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad |H_{45}\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \tag{7.6}$$

These eigenvectors describe light that is linearly polarized at ±45° to the z-axis.


These two bases have an important relationship that makes them particularly
useful for quantum cryptography. Recall that any polarization state of a photon
$|\psi\rangle$ can be expressed as a linear superposition in either of these orthonormal
bases, as in Equation 7.4. We can therefore expand each of the H/V eigenvectors
in terms of the diagonal basis vectors, and vice versa. For example, by writing

$$|V_{45}\rangle = a\,|V\rangle + b\,|H\rangle, \tag{7.7}$$

we can find the values of $a$ and $b$ from $a = \langle V|V_{45}\rangle$ and $b = \langle H|V_{45}\rangle$. This gives
the following relationships between the H/V and diagonal bases:

$$|V_{45}\rangle = \frac{1}{\sqrt{2}}\bigl(|V\rangle + |H\rangle\bigr), \tag{7.8a}$$
$$|H_{45}\rangle = \frac{1}{\sqrt{2}}\bigl(-|V\rangle + |H\rangle\bigr), \tag{7.8b}$$
$$|V\rangle = \frac{1}{\sqrt{2}}\bigl(|V_{45}\rangle - |H_{45}\rangle\bigr), \tag{7.8c}$$
$$|H\rangle = \frac{1}{\sqrt{2}}\bigl(|V_{45}\rangle + |H_{45}\rangle\bigr). \tag{7.8d}$$

These relationships can also be written in matrix form, for example

$$|V\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \tag{7.9}$$

From these equations, you can see that the two bases, H/V and diagonal, are not
orthogonal to each other and are, in fact, complementary bases, meaning that an
eigenvector in one basis has equal projections onto the two eigenvectors of the
other basis. This implies, for example, that polarization measurements performed
on an eigenvector of the diagonal basis, when carried out in the H/V basis, will
yield H and V polarized photons with equal probability. This property is crucial
in the schemes for quantum cryptography in the next section.
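
The complementarity of the two bases can be made concrete by tabulating every squared overlap between them. This is a minimal sketch using the column vectors of Equation 7.2; the helper names `linear_basis` and `overlap_probabilities` are illustrative and not part of the text.

```python
from math import cos, sin, radians

def linear_basis(theta):
    """Return (|V_theta>, |H_theta>) as column vectors in the H/V basis (Equation 7.2)."""
    return ((cos(theta), sin(theta)), (-sin(theta), cos(theta)))

def overlap_probabilities(prepared_basis, measured_basis):
    """table[i][j] = probability of outcome j of measured_basis for prepared state i."""
    return [
        [(m[0] * p[0] + m[1] * p[1]) ** 2 for m in measured_basis]
        for p in prepared_basis
    ]

hv = linear_basis(0.0)
diagonal = linear_basis(radians(45))

# A photon prepared in either diagonal state and analyzed in the H/V basis.
cross_table = overlap_probabilities(diagonal, hv)
# For comparison: prepared and analyzed in the same basis.
same_table = overlap_probabilities(hv, hv)

print("diagonal -> H/V:", [[round(x, 3) for x in row] for row in cross_table])
print("H/V -> H/V:     ", [[round(x, 3) for x in row] for row in same_table])
```

Every entry of the cross-basis table is 1/2, whereas the same-basis table is the identity: a measurement in the matching basis is certain, and a measurement in the complementary basis gives no information about which state was prepared.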


In the following sections, it will also be useful to be able to express the H/V basis 
eigenvectors in terms of other basis vectors. 


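Worked Example 7.1

Essential skill: Connecting measurement bases

Show that the H/V basis may be written in terms of the general eigenvectors
of $\mathcal{P}(\theta)$, $|V_\theta\rangle$ and $|H_\theta\rangle$, as

$$|V\rangle = \cos\theta\,|V_\theta\rangle - \sin\theta\,|H_\theta\rangle, \tag{7.10a}$$
$$|H\rangle = \sin\theta\,|V_\theta\rangle + \cos\theta\,|H_\theta\rangle. \tag{7.10b}$$

Solution

We start by writing $|V\rangle$ and $|H\rangle$ in terms of the eigenvectors of $\mathcal{P}(\theta)$ (see
Equation 7.4):

$$|V\rangle = a\,|V_\theta\rangle + b\,|H_\theta\rangle, \tag{7.11a}$$
$$|H\rangle = c\,|V_\theta\rangle + d\,|H_\theta\rangle. \tag{7.11b}$$

To find expressions for the coefficients $a$, $b$, $c$ and $d$, we use the
orthonormality of $|V_\theta\rangle$ and $|H_\theta\rangle$:

$$a = \langle V_\theta|V\rangle = \begin{bmatrix} \cos\theta & \sin\theta \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \cos\theta, \tag{7.12a}$$
$$b = \langle H_\theta|V\rangle = \begin{bmatrix} -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = -\sin\theta, \tag{7.12b}$$
$$c = \langle V_\theta|H\rangle = \begin{bmatrix} \cos\theta & \sin\theta \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \sin\theta, \tag{7.12c}$$
$$d = \langle H_\theta|H\rangle = \begin{bmatrix} -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \cos\theta. \tag{7.12d}$$

Substitution of Equations 7.12 into Equations 7.11 then gives
Equations 7.10.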


7.2 Quantum cryptography 
7.2.1 Classical and quantum cryptography 


Cryptography is the process of concealing the contents of a message from all
except the intended recipients, such that a message can be sent between two
people without risk that the contents of the message can be read by anyone else
who happens to intercept the message en route. Historically, cryptography has
been used to ensure secrecy in important communications such as those of spies,
military leaders and governments. In today’s information age, cryptography is
ubiquitous: you probably use cryptography every week for internet banking
transactions and internet shopping, and cryptographic techniques are increasingly
employed in consumer electronic devices such as digital media players.


Figure 7.3 Claude Shannon (1916–2001) founded the subject of information
theory, which forms the basis of all modern communications systems, in 1948,
but secure (one-time pad) encryption methods had been used as early as 1917,
by Gilbert Vernam and Joseph Mauborgne.


The randomness of the key destroys any patterns based on the frequency of
different letters in the message, making the ciphertext uncrackable.


Modern methods of transmitting information are based upon encoding the
message in binary form, i.e. as a sequence of 1s and 0s, an idea developed by
Claude Shannon (Figure 7.3). Each item, i.e. 1 or 0, is a single bit. Rendering a
message into a standard binary code understood by all computers is not
encryption; the resulting binary form of the message is known as plaintext.
Shannon established that a message could be encrypted by adding to the plaintext
a cryptographic key to produce a ciphertext (or cryptogram). The key is a
randomly chosen string of 0s and 1s with the same length as the message. How
this works is shown in Figure 7.4.


Figure 7.4 In this simple example of cryptography, the sender adds each bit of
the plaintext to the corresponding bit of the key using the rules of binary addition
(0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0) to obtain the ciphertext. The
ciphertext is then transmitted to the receiver who has an exact copy of the key.
The receiver is able to unlock the plaintext by once more adding the key to the
ciphertext. Any eavesdropper who listens in but doesn’t have an exact copy of the
key may intercept the ciphertext but will not be able to recover the plaintext, even
if she knows the method used for the encryption (simple binary addition in this
case).


Once the recipient is in possession of the ciphertext, he can add the key to the
ciphertext to recover the message. As long as only the sender and receiver are in
possession of the key, anyone else intercepting the ciphertext will not be able to
uncover the contents of the message. Shannon was able to show that this type of
scheme cannot be cracked provided that the key is random and is as long as the
message, and that the key is used only once (i.e. to send one message); such a key
is known as a one-time pad. In practice, it is usually inconvenient to use a key as
long as the message being sent, so a key length is chosen which ensures that, for
any eavesdropper intercepting the message, it would take too long to crack
the encryption; typical lengths are 128 or 256 bits. Modern cryptographic
techniques also avoid simple binary addition of the cryptographic key and the
plaintext, and instead use some more complicated mathematical operation.
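
The one-time pad scheme described above can be sketched in a few lines. The binary addition rule (1 + 1 = 0, no carry) is the exclusive-or operation; the function names and the sample plaintext below are illustrative assumptions, and the key is drawn from Python's `secrets` module purely as a convenient source of random bits.

```python
import secrets

def random_key(n_bits):
    """Return a list of n_bits random bits to serve as a one-time key."""
    return [secrets.randbits(1) for _ in range(n_bits)]

def add_bits(bits, key):
    """Add two equal-length bit strings bit by bit without carry (exclusive-or)."""
    if len(bits) != len(key):
        raise ValueError("the key must be exactly as long as the message")
    return [b ^ k for b, k in zip(bits, key)]

plaintext = [0, 1, 0, 1, 1, 1, 0, 0]      # hypothetical 8-bit message
key = random_key(len(plaintext))          # shared in advance by sender and receiver
ciphertext = add_bits(plaintext, key)     # what the sender transmits
recovered = add_bits(ciphertext, key)     # the receiver adds the same key again

assert recovered == plaintext
print("key       ", key)
print("ciphertext", ciphertext)
print("recovered ", recovered)
```

Adding the key twice returns the original message because each key bit added to itself gives 0. Nothing in this sketch addresses how the two parties come to share the key in the first place.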


This brings us to the central problem of cryptography: how to arrange for the
sender and receiver to have a copy of the cryptographic key without revealing it to
any eavesdroppers. This is called the key distribution problem. In order to send
information securely, both sender and receiver must know the cryptographic
key. However, if the communication channel being used to establish the key
is monitored by an eavesdropper, the encryption is rendered useless, as the
eavesdropper will also have a copy of the key and will be able to unlock the
ciphertext when it is sent. Since all information, including cryptographic keys,
is encoded in the physical properties of some object or signal, any method
relying upon classical physics will be vulnerable to eavesdropping, since the
eavesdropper can measure physical properties without disturbing them, and so the
eavesdropping may go undetected by the sender and receiver. However, we know
that measurement of an observable in a quantum-mechanical system necessarily
disturbs the state of the system: after measurement, the quantum system will be
in one of the eigenvectors of the observable’s operator. This fact forms the basis
of quantum cryptography. Gilles Brassard, one of the pioneers of quantum
cryptography, put it as follows:


‘Quantum cryptography is the only approach to privacy ever proposed that 
allows two parties (who do not share a long secret key ahead of time) to 
communicate with provably perfect secrecy under the nose of an 
eavesdropper endowed with unlimited computational power and whose 
technology is limited by nothing but the fundamental laws of nature.’ 


To date, most methods of quantum key distribution (QKD) rely upon encoding
the cryptographic key with the polarization of photons. In the following sections
we'll explore two different schemes: (i) the BB84 protocol, which exploits
the fact that any measurement by an eavesdropper will disturb the state of
the photons in transmission, and (ii) the Eckert protocol, which exploits the
non-local correlations of entangled photons. These correlations are destroyed by
eavesdropping.


7.2.2 QKD with polarized photons — the BB84 protocol 


Suppose that two people in different locations, Alice and Bob, wish to establish
a cryptographic key in order to communicate securely. Alice and Bob have
available to them a classical communication channel (such as a telephone
line) which is public; eavesdropping on any communications sent via this
channel could be undetectable. Alice and Bob also have available a quantum
communication channel over which they seek to establish a cryptographic key
by encoding a sequence of bits in the polarization states of individual photons.
Encoding bits in quantum states allows Alice and Bob to determine whether their
communications have been intercepted. In the following discussion we’ll show
how Alice and Bob can establish a common cryptographic key, and ensure it has
not been discovered by an eavesdropper. The scheme, often referred to as the
BB84 protocol, is depicted in Figure 7.5, and was first introduced by Charles
Bennett and Gilles Brassard in 1984.


‘Classical’ channels will inevitably involve transistors and the like that
require quantum mechanics for an understanding of how they work. However, the
information is carried by vast numbers of electrons or photons, the behaviour of
which can be described by classical theories. Quantum channels typically
involve information encoded in single photons.


Figure 7.5 Alice and Bob seek to establish a secure cryptographic key and at
the same time ensure that an eavesdropper hasn’t discovered the key by listening
in on the quantum channel. Alice uses a quantum communication channel to send
bits encoded in the polarization of photons to Bob. Alice and Bob can also
communicate over a public classical channel (e.g. a telephone line). Any
information sent via this classical channel is considered insecure, since it is
not possible to ascertain whether or not it has been intercepted by an
eavesdropper.


Table 7.1 Correspondence for encoding bits in the polarization of photons. In
the H/V basis (red), a vertically polarized photon, $|V\rangle$, represents 1,
etc. The diagonal basis is indicated in purple.


H/V         $|V\rangle$         $|H\rangle$
Diagonal    $|V_{45}\rangle$    $|H_{45}\rangle$
Bit         1                   0


[Figure 7.5 diagram. Labels: photon source, polarizer, quantum channel,
classical channel.]


In this scheme, Alice sends the key to Bob by encoding bits of data in the 
polarization states of photons. We shall assume that Alice and Bob have 
previously agreed to use a particular correspondence between ones and zeros and 
the measured polarization states of photons. This correspondence is shown in 
Table 7.1. These ‘quantum bits’ are sent over the quantum communication 
channel, which could be an optical fibre that has been designed to preserve the 
polarization of the photons during transmission. 


In the following discussion it is important to bear in mind that neither Alice 
nor Bob knows the cryptographic key prior to going through the process of 
simultaneously generating and sharing it. In brief, the sequence of actions that
Alice and Bob undertake in order to establish and share the key is as follows.


1. For each bit to be transmitted, Alice randomly chooses to use either the H/V 
basis or the diagonal basis for transmission. 


2. In her chosen basis, Alice transmits either a 1 or a 0, chosen randomly, 
encoded as in Table 7.1. 


3. Bob randomly chooses either the H/V basis or the diagonal basis to measure 
the polarization of the photon sent by Alice in the previous step. 


4. Steps 1-3 are repeated until a sufficiently long string of random bits has 
been transmitted. 


5. Bob and Alice let each other know over the (insecure) classical channel 
which basis they used for each photon. Bob and Alice discard any bits for 
which they did not employ the same basis. 


6. In order to test for eavesdroppers, Alice and Bob then compare a subset of 
their bit strings. If these bits match, Bob and Alice use the remaining 
undisclosed bits to make up the cryptographic key. 


In order to understand how this process results in a secure key, we need to 
examine each step carefully. For each photon transmitted, Alice chooses either the 
H/V or the diagonal basis at random. Alice also chooses at random whether to 
send a 1 or a 0 bit. Examining Table 7.1, we see that, for example, if Alice has 
chosen the H/V basis and wishes to send a 1, she must send a photon in the state
$|V\rangle$. As Bob receives the photon he makes a polarization measurement, randomly
choosing either the H/V or the diagonal basis to make his measurement. The
possible results of his measurements are the subject of the following exercise.


Exercise 7.5 Fill in the bottom row of the following table showing the 
possible outcomes of Bob’s measurements. Where 0 or 1 cannot be predicted with 
certainty, write ‘1 or 0’. The diagonal basis is denoted by a D. 


Alice’s basis         H/V   H/V   H/V   H/V   D     D     D     D
Alice’s sent bit      1     0     1     0     1     0     1     0
Bob’s basis           H/V   H/V   D     D     H/V   H/V   D     D
Bob’s detected bit


Examining the first and second columns (and the last two columns) in the table, 
for which Bob measured in the same basis as Alice prepared the photon, we see 
that Alice and Bob record the same bit. On the other hand, examining columns
3 to 6, for which Bob measures in a different basis to that used by Alice, we see that
Bob will measure a 1 or a 0 with equal probability, no matter what Alice sends. 
This is because the H/V and diagonal bases are complementary, as defined in the 
last section. As a result, Alice and Bob will record different bit values for half 
(on average) of the occasions when they employ different bases. Bob measures 
the value of the bit sent by Alice with certainty only if he and Alice happen to use 
the same basis. Once Alice has sent her random string of bits, Alice and Bob 
communicate publicly, i.e. over the classical channel, concerning which basis they 
employed for each photon. This information cannot be used by an eavesdropper to 
reconstruct the string of bits sent by Alice. Alice and Bob then discard all bits for 
which they did not employ the same basis. The remaining sequence of bits 
possessed by both Bob and Alice should then be identical copies and so could be 
used as a cryptographic key. 
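The procedure without an eavesdropper can be imitated with a short classical simulation. This is a sketch under stated assumptions: the function names `measure` and `bb84_sift` are invented for this illustration, and the quantum measurement is replaced by the rule given in the text (a certain outcome when the bases agree, a 50/50 outcome when they are complementary).

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Bit read out when a photon encoding `bit` in `prep_basis` is measured in `meas_basis`."""
    if prep_basis == meas_basis:
        return bit                # same basis: the outcome is certain
    return rng.randint(0, 1)      # complementary basis: 0 or 1 with equal probability


def bb84_sift(n_photons, rng):
    """Steps 1 to 5 of the protocol: send, measure, then keep only the same-basis bits."""
    alice_bits, bob_bits = [], []
    for _ in range(n_photons):
        a_basis = rng.choice(["HV", "D"])   # step 1: Alice's random basis
        bit = rng.randint(0, 1)             # step 2: Alice's random bit
        b_basis = rng.choice(["HV", "D"])   # step 3: Bob's random basis
        result = measure(bit, a_basis, b_basis, rng)
        if a_basis == b_basis:              # step 5: discard mismatched bases
            alice_bits.append(bit)
            bob_bits.append(result)
    return alice_bits, bob_bits


rng = random.Random(1)                      # fixed seed so the run is repeatable
alice_key, bob_key = bb84_sift(2000, rng)
print(len(alice_key), alice_key == bob_key)
```

Roughly half of the transmitted photons survive the sifting step, because Alice and Bob happen to choose the same basis about half of the time.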


Before proceeding, you 
must refer to the solution 
to Exercise 7.5. 


We now ask: what happens if an eavesdropper, Eve, intercepts photons sent over 
the quantum channel and makes a polarization measurement in either the H/V or 
the diagonal basis? If she is to stand a chance of remaining undetected by Bob 
and Alice, she’ll have to forward a photon to Bob that is polarized according to 
the outcome of her measurement. For example, if Alice and Bob both use the 
H/V basis, and Alice sends a 0 bit to Bob, which Eve intercepts also in the H/V 
basis, Eve will measure and send on a 0 in the H/V basis, and her eavesdropping 
will go undetected. But if Eve measures in the diagonal basis, she may measure 1 
or 0 with equal probability, and hence stands a 50% chance of sending a wrongly 
polarized photon to Bob. As a result, the keys held by Alice and Bob will no 
longer be identical. If Alice and Bob then compare publicly (over the classical 
channel) a subset of their respective bit strings, they can detect these errors. If 
errors are found, they discard the key. Conversely, if no error is found, the key is 
deemed secure. In this case, Alice and Bob discard the bits they have discussed 
over the classical channel (since this communication may have been eavesdropped 
upon), and they can use the remaining (undisclosed) bits of their strings as the 
cryptographic key. 
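The effect of Eve's intercept-and-resend strategy can be explored with a classical Monte Carlo sketch. The names `measure` and `sifted_error_rate` are assumptions made for this illustration, and the quantum measurement is again modelled by the rule stated in the text: a certain outcome in the preparation basis, and a 50/50 outcome in the complementary basis.

```python
import random

BASES = ("HV", "D")


def measure(bit, prep_basis, meas_basis, rng):
    """Outcome of measuring, in `meas_basis`, a photon prepared as `bit` in `prep_basis`."""
    if prep_basis == meas_basis:
        return bit
    return rng.randint(0, 1)


def sifted_error_rate(n_photons, rng, eve_present=True):
    """Fraction of same-basis bits on which Alice's and Bob's records disagree."""
    kept = errors = 0
    for _ in range(n_photons):
        a_basis, a_bit = rng.choice(BASES), rng.randint(0, 1)
        basis_in, bit_in = a_basis, a_bit        # photon as it leaves Alice
        if eve_present:
            e_basis = rng.choice(BASES)          # Eve picks a basis at random
            e_bit = measure(a_bit, a_basis, e_basis, rng)
            basis_in, bit_in = e_basis, e_bit    # Eve resends what she measured
        b_basis = rng.choice(BASES)
        b_bit = measure(bit_in, basis_in, b_basis, rng)
        if a_basis == b_basis:                   # only these bits are kept
            kept += 1
            errors += (b_bit != a_bit)
    return errors / kept


rng = random.Random(2)
rate_with_eve = sifted_error_rate(20000, rng, eve_present=True)
rate_without_eve = sifted_error_rate(20000, rng, eve_present=False)
print(rate_with_eve, rate_without_eve)
```

Comparing a publicly disclosed subset of the sifted bits amounts to estimating this error rate: any value clearly above zero signals that the photons were disturbed in transit.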


Exercise 7.6 A large number of photons are sent by Alice, and the polarization
of each photon is measured by Eve, who randomly chooses her measurement
basis. In the hope of being undetected, Eve then transmits to Bob a photon in the
polarization state that she measures. By constructing a suitable table, determine
what percentage of the bits in Alice’s and Bob’s strings will differ after they’ve
discarded all bits for which they used different bases.


The scheme we describe is based on a proposal due to Artur Eckert; we shall
call it the Eckert protocol.


Figure 7.6 Quantum key distribution with entangled photons. A source of
entangled photons exists between Alice and Bob. One photon of each pair is sent
to Alice, and the other is sent to Bob. Alice and Bob perform polarization
measurements on the photons they receive. They also have a classical channel.
Neither Alice nor Bob knows the random cryptographic key prior to the process
they go through to generate it.


Alice: $\theta = \alpha = (0^\circ, 22.5^\circ, 45^\circ)$
Bob: $\theta = \beta = (22.5^\circ, 45^\circ, -22.5^\circ)$


7.2.3 QKD with entanglement — the Eckert protocol 


Secure quantum key distribution can also be carried out by exploiting pairs of 
entangled photons. As we shall see, the non-local character of entangled states of 
photons allows Alice and Bob to determine with certainty whether or not anyone 
has eavesdropped on their key exchange. In the BB84 cryptography scheme, Alice 
and Bob discarded all measurements in which they used different measurement 
bases, and then, in order to test whether the resulting key was secure, they 
compared a subset of the bits of the key. These ‘test’ bits are subsequently 
discarded. Here we describe a scheme employing pairs of entangled photons, 
which uses some of the measurements made with different bases to ensure the 
security of the cryptographic key. For this scheme, it is not necessary to discard 
bits of the key once it is deemed secure. 


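[Figure 7.6 diagram. Labels: classical channel, polarizer (one for each of
Alice and Bob), entangled-photon source emitting the state
$\frac{1}{\sqrt{2}}\left(|V\rangle_A |H\rangle_B - |H\rangle_A |V\rangle_B\right)$.]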


Suppose that a source producing entangled pairs of photons is located between 
Alice and Bob, as shown in Figure 7.6. One photon of each pair is sent to Bob, 
and the other is sent to Alice. The photons might travel in an optical fibre of a 
quality that preserves the polarization, or even in free space. The method uses the
general basis $|V_\theta\rangle$ and $|H_\theta\rangle$, defined in Equations 7.2, for several values of $\theta$. The
sequence of events that enables a key to be established is, in outline, as follows.


1. Alice and Bob both make a polarization measurement on their respective
photons. Alice chooses randomly from the three bases defined by
$\theta = \alpha = (0^\circ, 22.5^\circ, 45^\circ)$, and Bob chooses randomly from the bases defined
by $\theta = \beta = (22.5^\circ, 45^\circ, -22.5^\circ)$. The reasons for choosing these angles
will become clear later.


2. Alice and Bob reveal over the classical communication channel which basis
they used for each measurement.


3. Alice and Bob divide their measurements up into two groups: those for
which they employed different bases (group I), and those for which they
employed the same basis (group II).


4. Alice and Bob use the group I measurements to determine whether their
measurements exhibit quantum non-local correlations. If this is the case,
Alice and Bob use the measurements from group II to construct the
cryptographic key, and they can be confident that this key is secure.


group I: different bases
group II: same basis


We now examine this process in more detail. Depending on the nature of the
source, entangled pairs of photons can be produced in various states. Four
particularly important entangled states are known as Bell states:


\[ |\Psi^{\pm}\rangle = \frac{1}{\sqrt{2}} \left( |V\rangle_A |H\rangle_B \pm |H\rangle_A |V\rangle_B \right), \qquad (7.13a) \]
\[ |\Phi^{\pm}\rangle = \frac{1}{\sqrt{2}} \left( |V\rangle_A |V\rangle_B \pm |H\rangle_A |H\rangle_B \right), \qquad (7.13b) \]


where the subscripts A and B label the two photons, and we have in mind that
photon A is sent to Alice and photon B is sent to Bob. In Section 6.4.3, we
considered the state $|\Phi^{+}\rangle$ in connection with Aspect’s experiments. Here, we
shall consider a different Bell state:


\[ |\Psi^{-}\rangle = \frac{1}{\sqrt{2}} \left( |V\rangle_A |H\rangle_B - |H\rangle_A |V\rangle_B \right), \qquad (7.14) \]


but this choice is not an essential one for the encryption method we shall
describe; any of the four Bell states in Equations 7.13 could be used.


Bell states are also called maximally-entangled states.




Exercise 7.7 By expressing the state vector for the entangled photons in
Equation 7.14 as a linear combination of $|V_\theta\rangle$ and $|H_\theta\rangle$ using the relationships in
Equations 7.10, show that Equation 7.14 may be written as


\[ |\Psi^{-}\rangle = \frac{1}{\sqrt{2}} \left( |V_\theta\rangle_A |H_\theta\rangle_B - |H_\theta\rangle_A |V_\theta\rangle_B \right) \qquad (7.15) \]


for any angle $\theta$.


This exercise shows that the entangled state $|\Psi^{-}\rangle$ ‘looks the same from all angles’
in the same way that the state $|\Phi^{+}\rangle$ considered in Chapter 6 looked the same from
all angles.


From Equation 7.15 we can see that as long as Alice and Bob choose the same
basis for making their measurements (i.e. the same value of $\theta$), their results will be
‘anti-correlated’, meaning that whenever Alice finds a photon vertically polarized
in the $\theta$ direction, the entangled state will collapse onto its first component and, as
a result, Bob will find a horizontally-polarized photon when his detector is
oriented in the same direction. Similarly, if Alice finds a horizontally-polarized
photon, Bob will find a vertically-polarized photon. This is true for any choice of
$\theta$ made by Alice and Bob.


Now suppose that entangled pairs of photons (Equation 7.14) are being sent out
to Alice and Bob. When each receives a photon, they perform a polarization
measurement on it. For these measurements, Alice and Bob select their polarizer
axes randomly from three directions: Alice selects her axis direction $\alpha$ randomly
from $\alpha_1 = 0^\circ$, $\alpha_2 = 22.5^\circ$ and $\alpha_3 = 45^\circ$, and Bob selects his axis direction $\beta$
from $\beta_1 = 22.5^\circ$, $\beta_2 = 45^\circ$ and $\beta_3 = -22.5^\circ$; see Figure 7.7.


Figure 7.7 Polarizer axis directions used by Alice ($\alpha$) and Bob ($\beta$).


We will describe the correlations between Alice’s and Bob’s measurements in
terms of the probabilities $P_{\pm\pm}(\alpha, \beta)$, describing the probability of measurement
outcomes when Alice and Bob set their polarizer axes at angles $\alpha$ and $\beta$,
respectively. Here the first subscript refers to Alice’s measurement and the second
refers to Bob’s measurement, and we use a ‘+’ to refer to detection of vertical
polarization (value +1), and a ‘-’ to refer to detection of horizontal polarization
(value -1). For example, $P_{+-}(\alpha, \beta)$ is the probability that, for a single pair of
photons, Alice detects a vertically-polarized photon with her polarizer set at
$\theta = \alpha$, and Bob detects a horizontally-polarized photon with his polarizer set at
$\theta = \beta$. If the photons remain undisturbed as they travel to Alice and Bob, they
will be in the entangled state described by Equation 7.14 when they are detected.
That being the case, the probabilities $P_{\pm\pm}(\alpha, \beta)$ can be calculated quantum
mechanically, as shown in the following exercise.


$P_{\pm\pm}(\alpha, \beta)$ should not be confused with $P_{\pm}$ defined in Equations 7.5.


For a rotationally-invariant state we could write
$C(\alpha_i, \beta_j) = C(\alpha_i - \beta_j)$. We refrain from doing this here because we will
eventually need to consider a non-rotationally-invariant state.


Exercise 7.8 Use Equations 7.12 and 7.14 to find an expression for $P_{++}(\alpha, \beta)$
in terms of the angles $\alpha$ and $\beta$.


It can be shown, using arguments similar to those in the exercise above, that the
probabilities for the four possible outcomes of a measurement by Alice and Bob
on one pair of photons are given by


\[ P_{++}(\alpha, \beta) = P_{--}(\alpha, \beta) = \tfrac{1}{2} \sin^2(\alpha - \beta), \qquad (7.16a) \]
\[ P_{+-}(\alpha, \beta) = P_{-+}(\alpha, \beta) = \tfrac{1}{2} \cos^2(\alpha - \beta). \qquad (7.16b) \]


From these equations you will see that when Alice and Bob employ the same
basis (i.e. when their polarizer axes are parallel and $\alpha = \beta$), their results are
perfectly anti-correlated, meaning that $P_{++} = P_{--} = 0$ and $P_{+-} + P_{-+} = 1$.


This provides Alice and Bob with a means for establishing their cryptographic 
key: Alice and Bob can tell each other over the classical channel which polarizer 
axis direction they used for each polarization measurement. By taking the results 
from all measurements where they employed the same basis (group II), Alice and 
Bob will have strings of bits which are perfectly anti-correlated: Alice’s key will
be the same as Bob’s key but with all the 1s replaced with 0s, and vice versa. If
one of Alice or Bob, by previous agreement, interchanges 1s and 0s, Alice and
Bob will have established a common key.
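The construction of the key from the group II measurements can be sketched as a classical sampling exercise. This is only an illustration: the function names `sample_pair` and `group_two_keys` are invented here, and the sampler simply draws joint outcomes with the probabilities of Equations 7.16 (each of Alice's outcomes equally likely, and Bob's outcome equal to Alice's with probability $\sin^2(\alpha - \beta)$). It reproduces the statistics of a single pair of settings but is not a model of the underlying physics.

```python
import math
import random

ALICE_ANGLES = (0.0, 22.5, 45.0)     # alpha_1, alpha_2, alpha_3 in degrees
BOB_ANGLES = (22.5, 45.0, -22.5)     # beta_1, beta_2, beta_3 in degrees


def sample_pair(alpha_deg, beta_deg, rng):
    """Draw (Alice, Bob) outcomes, +1 vertical or -1 horizontal, following Equations 7.16."""
    delta = math.radians(alpha_deg - beta_deg)
    p_same = math.sin(delta) ** 2            # P++ + P-- = sin^2(alpha - beta)
    alice = rng.choice((+1, -1))             # each of Alice's outcomes has probability 1/2
    bob = alice if rng.random() < p_same else -alice
    return alice, bob


def group_two_keys(n_pairs, rng):
    """Keep the same-basis (group II) results; Bob interchanges 1s and 0s by agreement."""
    alice_key, bob_key = [], []
    for _ in range(n_pairs):
        alpha = rng.choice(ALICE_ANGLES)
        beta = rng.choice(BOB_ANGLES)
        a, b = sample_pair(alpha, beta, rng)
        if alpha == beta:                    # group II: parallel polarizer axes
            alice_key.append(1 if a == +1 else 0)
            bob_key.append(0 if b == +1 else 1)
    return alice_key, bob_key


rng = random.Random(3)
alice_key, bob_key = group_two_keys(900, rng)
print(len(alice_key), alice_key == bob_key)
```

Only the settings 22.5° and 45° are common to both lists of angles, so on average two pairs in nine contribute to the key.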


But can Alice and Bob be sure that their key hasn’t been compromised by Eve 
listening in? They could proceed in the same way as with the BB84 protocol 
described previously, and compare a subset of their key bits to check for errors 
induced by Eve’s listening in and collapsing the entangled state. However, the 
entanglement of the photons offers Alice and Bob an alternative means for 
establishing the security of their key, and it turns out that the quantity that enables 
them to do this is the correlation function 


\[ C(\alpha_i, \beta_j) = P_{++}(\alpha_i, \beta_j) + P_{--}(\alpha_i, \beta_j) - P_{+-}(\alpha_i, \beta_j) - P_{-+}(\alpha_i, \beta_j). \qquad (7.17) \]


This is precisely the quantity introduced in Section 6.3.4 in connection with tests
for hidden variables. Substituting Equation 7.16 into Equation 7.17, we find


\[ C(\alpha_i, \beta_j) = \sin^2(\alpha_i - \beta_j) - \cos^2(\alpha_i - \beta_j) = -\cos[2(\alpha_i - \beta_j)], \qquad (7.18) \]


so $C(\alpha_i, \beta_j)$ can take values between -1 and +1. A value of +1 indicates
that Alice’s and Bob’s measurements are entirely correlated, meaning that
their measurements have the same outcome. A value of -1 indicates that their
measurements are entirely anti-correlated, having opposite outcomes. A value of
0 indicates that there is no correlation between the results of Alice’s and Bob’s
measurements.


This has the opposite sign to the correlation function for entangled photons
found in Chapter 6 (Equation 6.30). This is because a different entangled
state ($|\Phi^{+}\rangle$) was considered in the earlier chapter.


● Give values of $\alpha_i - \beta_j$ for which $C(\alpha_i, \beta_j) = +1$ and $C(\alpha_i, \beta_j) = -1$, and
interpret your answers.

○ $C(\alpha_i, \beta_j) = +1$ when $\alpha_i - \beta_j = \pi/2$; this corresponds to the fact that
orthogonal analyzers will both register vertical or both register horizontal
polarized photons, i.e. $P_{++} = P_{--} = \tfrac{1}{2}$, $P_{+-} = P_{-+} = 0$.
$C(\alpha_i, \beta_j) = -1$ when $\alpha_i - \beta_j = 0$; this corresponds to the fact that parallel
analyzers will register horizontal/vertical or vertical/horizontal polarized
photons, i.e. $P_{+-} = P_{-+} = \tfrac{1}{2}$, $P_{++} = P_{--} = 0$.


In order to establish whether or not the photons arriving at Alice and Bob’s
detectors are described by the state in Equation 7.14, Alice and Bob communicate
over the classical channel the outcomes of some of the measurements where they
employed different bases (group I). This allows them to calculate, using the
measured probabilities, a value of the expression


\[ \Sigma = C(\alpha_1, \beta_1) + C(\alpha_1, \beta_3) + C(\alpha_3, \beta_1) - C(\alpha_3, \beta_3), \qquad (7.19) \]


which you will recognize as the expression in the CHSH inequality introduced in
Section 6.3. You will recall that the presence of local hidden variables would
require $|\Sigma|$ less than or equal to 2. This limit can only be exceeded if there are
quantum-mechanical correlations.


Comparing with Equation 6.20, we have $\alpha_1 = \theta_1$, $\beta_1 = \theta_2$,
$\alpha_3 = \theta_1'$ and $\beta_3 = \theta_2'$.


Exercise 7.9 Show, using Equation 7.18, that, for the Bell state described by
Equation 7.14, $\Sigma$ takes the value $-2\sqrt{2}$ for the settings $\alpha_1$, $\alpha_3$, $\beta_1$, $\beta_3$ chosen by
Alice and Bob ($\alpha_1 = 0^\circ$, $\alpha_3 = 45^\circ$, $\beta_1 = 22.5^\circ$, $\beta_3 = -22.5^\circ$).


If the photons behave as a quantum-mechanically entangled system exhibiting
non-local correlations, as discussed in Chapter 6, then for the set of angles in the
exercise above we expect to find $|\Sigma| = 2\sqrt{2}$. But, what happens to the value of $\Sigma$
if Eve intercepts the photons? Suppose that Eve intercepts one or both of each pair
of entangled photons and makes a polarization measurement. She could create and
send on a new pair of entangled photons, but this would be pointless because the
values obtained by Alice and Bob for the new entangled pair need not be the same
as the values obtained by Eve for the intercepted pair, so Eve would learn nothing
about Alice and Bob’s key. Instead, Eve might try to send on a pair of photons
that are polarized according to her measurements, as she did when eavesdropping
on the BB84 scheme. But, of course, her act of measurement will have collapsed
the state vector and destroyed the entanglement between the photons! This means
the quantum non-local correlations between the photons are no longer present.


Measurement destroys entanglement.


For example, suppose that Eve measures the intercepted photons in the H/V
basis. Following her measurement, the state vector will collapse onto either
$|V\rangle_A |H\rangle_B$ or $|H\rangle_A |V\rangle_B$. Eve will then need to prepare photons in the same state
as she measured them (i.e. $|V\rangle_A |H\rangle_B$ or $|H\rangle_A |V\rangle_B$) and send them to Alice and
Bob. However, when Alice and Bob make measurements on these photons, there
will be no correlation between their measurement results: if Eve sends the state
$|V\rangle_A |H\rangle_B$ to Alice and Bob, the probability that a measurement by Alice at angle
$\alpha$ will yield a vertically-polarized photon will be $\cos^2\alpha$, but it will not depend
in any way upon what Bob measures for his photon. The quantum non-local
correlations are gone. To see what this does to the correlation function, we must
calculate $P_{\pm\pm}(\alpha, \beta)$ for the states $|V\rangle_A |H\rangle_B$ and $|H\rangle_A |V\rangle_B$; let us do it for
$|V\rangle_A |H\rangle_B$, which is the same as $|VH\rangle$ in positional notation. Here are two of the
cases, with Alice measuring in the basis at angle $\theta = \alpha$ and Bob at angle $\theta = \beta$:


\[ P_{++}(\alpha, \beta) = \left| \langle V_\alpha V_\beta | VH \rangle \right|^2 = \left| \langle V_\alpha | V \rangle_A \, \langle V_\beta | H \rangle_B \right|^2 = \cos^2\alpha \, \sin^2\beta, \]
\[ P_{--}(\alpha, \beta) = \left| \langle H_\alpha H_\beta | VH \rangle \right|^2 = \left| \langle H_\alpha | V \rangle_A \, \langle H_\beta | H \rangle_B \right|^2 = \sin^2\alpha \, \cos^2\beta. \]


Exercise 7.10 Find expressions for $P_{+-}(\alpha, \beta)$ and $P_{-+}(\alpha, \beta)$ for the state
$|V\rangle_A |H\rangle_B$.


With these expressions for $P_{\pm\pm}(\alpha, \beta)$, the correlation function $C(\alpha_i, \beta_j)$ in
Equation 7.17 will no longer have the form given in Equation 7.18 that applied to
the entangled state $|\Psi^{-}\rangle$. We now have


\[
\begin{aligned}
C(\alpha_i, \beta_j) &= \cos^2\alpha_i \sin^2\beta_j + \sin^2\alpha_i \cos^2\beta_j - \cos^2\alpha_i \cos^2\beta_j - \sin^2\alpha_i \sin^2\beta_j \\
&= -\left(\cos^2\alpha_i - \sin^2\alpha_i\right)\left(\cos^2\beta_j - \sin^2\beta_j\right) \\
&= -\cos(2\alpha_i)\cos(2\beta_j).
\end{aligned}
\qquad (7.20)
\]


Note that in this case, $C(\alpha_i, \beta_j)$ is not a function of $\alpha_i - \beta_j$; this
is because the state sent by Eve does not look the same from different angles.


Substituting for the angles noted above (for Alice, $\alpha_1 = 0^\circ$ and $\alpha_3 = 45^\circ$, and for
Bob, $\beta_1 = 22.5^\circ$ and $\beta_3 = -22.5^\circ$) we find


\[ C(\alpha_1, \beta_1) = -\frac{1}{\sqrt{2}}, \quad C(\alpha_1, \beta_3) = -\frac{1}{\sqrt{2}}, \quad C(\alpha_3, \beta_1) = 0, \quad C(\alpha_3, \beta_3) = 0. \]


● Verify the expression for $C(\alpha_1, \beta_1)$.
○ $C(\alpha_1, \beta_1) = -\cos(0^\circ)\cos(45^\circ) = -1/\sqrt{2}$.


Putting together the values for $C(\alpha_i, \beta_j)$, we find $\Sigma = -\sqrt{2}$. This has
a magnitude that is well within the CHSH limit $|\Sigma| \le 2$, whereas if Eve had not
eavesdropped, the value $-2\sqrt{2}$ would have had a magnitude exceeding that
limit. There is therefore clear evidence of eavesdropping.
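The two values of the CHSH expression quoted above can be checked numerically. The sketch below evaluates Equation 7.19 using the correlation functions of Equations 7.18 and 7.20; the function names are assumptions made for this illustration.

```python
import math


def c_entangled(alpha, beta):
    """Equation 7.18: correlation function for the entangled state of Equation 7.14."""
    return -math.cos(2 * (alpha - beta))


def c_intercepted(alpha, beta):
    """Equation 7.20: correlation function when Eve resends the product state |V>_A |H>_B."""
    return -math.cos(2 * alpha) * math.cos(2 * beta)


def chsh(c, a1, a3, b1, b3):
    """Equation 7.19: C(a1, b1) + C(a1, b3) + C(a3, b1) - C(a3, b3)."""
    return c(a1, b1) + c(a1, b3) + c(a3, b1) - c(a3, b3)


# Settings used by Alice and Bob, converted to radians.
a1, a3 = math.radians(0.0), math.radians(45.0)
b1, b3 = math.radians(22.5), math.radians(-22.5)

sigma_entangled = chsh(c_entangled, a1, a3, b1, b3)
sigma_intercepted = chsh(c_intercepted, a1, a3, b1, b3)
print(sigma_entangled, sigma_intercepted)
```

Up to rounding, the two printed numbers should agree with the values $-2\sqrt{2}$ and $-\sqrt{2}$ obtained in the text, one outside and one inside the range allowed by the CHSH inequality.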


In summary, the non-local behaviour of entangled particles (in this case photons) 
can only be maintained if, during their journey, they remain undisturbed. By 
checking the correlations between measurements, Alice and Bob are able to 
determine if non-locality has been maintained. If it has, then Alice and Bob can 
be assured that Eve has learnt nothing about the key. 




7.2.4 Quantum cryptography in the real world 


In 2000, a dramatic experimental demonstration of quantum key distribution 
was reported by Anton Zeilinger and co-workers in Austria. These scientists 
successfully implemented the BB84 protocol and used a source of entangled 
photons to implement a cryptographic scheme similar to that described in 
Section 7.2.3. An image of the Venus of Willendorf was encrypted, transmitted 
over 360 metres, and successfully decrypted upon receipt. Such systems have
now been shown to work over many kilometres, and commercial systems for
quantum cryptography that are able to exchange 100 cryptographic keys per
second are now available from several companies. It seems certain that these
systems will be the standard for secure data transmission in future years.


[Figure 7.8 image. Labels: Alice’s key, Bob’s key, bitwise transformation
(twice), original, encrypted, decrypted.]


Figure 7.8 In 2000, researchers in Austria successfully exchanged a 49 984 bit
cryptographic key using entangled photons. This key was used to encrypt an
image of Venus of Willendorf which was then transmitted over a distance of
360 m and successfully decrypted with good fidelity.


The Venus of Willendorf is about 23 000 years old; she now resides in the
Naturhistorisches Museum, Vienna.


Figure 7.9 A system for quantum encryption of computer data for secure
transmission across a network. This system is capable of transmitting encrypted
data over distances of up to 100 km, with new cryptographic keys generated at up
to 100 times per second. The system shown in this picture is commercially
available and is produced by the company id Quantique.


7.3 Quantum teleportation 


In science fiction, teleportation describes a process where an object (e.g. a person) 
disappears in one location and an identical replica object appears at some remote 
location. One way of achieving this might be to take measurements of all the 
object’s characteristics, transmit these to the remote location, and then construct a 
copy of the object there. Classically, we would have to measure the positions
and velocities of all the particles in the object, something that is ruled out by
the uncertainty principle. Quantum-mechanically, it is practically impossible
to measure the state of an object without changing that state. There is also a
fundamental theorem that prevents us from producing a copy of an object to place 
beside the original; this is the ‘no-cloning theorem’ which we now introduce. 


7.3.1 The no-cloning theorem 


Having read about all the problems that Eve faces when trying to eavesdrop on 
Alice and Bob’s key exchanges in the previous sections, you may be wondering 
why Eve doesn’t make a number of identical copies of the photons she intercepts 
before making her measurements. If she were able to do this, she could send one 
copy on to Alice or Bob, and make a number of measurements on the remaining 
copies, allowing her to determine with certainty the state of the intercepted 
photon. This would then allow her to determine the cryptographic key and remain 
undetected! So it seems that we should ask the question: ‘Is it possible to make an 
exact copy (a clone) of an arbitrary unknown quantum state?’ 


Let us consider trying to clone an unknown state of a photon, which we call
photon A. We can write a general state of photon A in the H/V basis as


\[ |\psi\rangle_A = \alpha |V\rangle_A + \beta |H\rangle_A. \qquad (7.21) \]


We could choose any other basis rather than the H/V basis, and the arguments 
would be unchanged. In order to make a copy of photon A, we need to take 
another photon (an ‘ancillary’ photon), which we shall label B, initially in some
state $|\phi\rangle_B$, and place it in exactly the same state as photon A. Before our cloning
process, the joint state vector for both photons is the product state $|\psi\rangle_A |\phi\rangle_B$. Our
cloning machine must act on this state to form $|\psi\rangle_A |\psi\rangle_B$, and it must be able to
do this for any state $|\psi\rangle_A$, i.e. for any choice of $\alpha$ and $\beta$ in Equation 7.21. We
should not do anything too violent to the initial state $|\psi\rangle_A |\phi\rangle_B$; in particular, we
must avoid taking a measurement that causes the state of photon A to collapse
irreversibly. This means that we must consider the normal time-development of
states in quantum mechanics, which is governed by Schrödinger’s equation. Since
Schrödinger’s equation is linear, we can represent the effect of the quantum
cloning machine by a linear operator $\hat{U}$ which needs to act according to the rule


\[ \hat{U} |\psi\rangle_A |\phi\rangle_B = |\psi\rangle_A |\psi\rangle_B. \qquad (7.22) \]

Expanding the right-hand side of Equation 7.22 using Equation 7.21 gives


\[
\begin{aligned}
|\psi\rangle_A |\psi\rangle_B &= \left( \alpha |V\rangle_A + \beta |H\rangle_A \right) \left( \alpha |V\rangle_B + \beta |H\rangle_B \right) \\
&= \alpha^2 |V\rangle_A |V\rangle_B + \beta^2 |H\rangle_A |H\rangle_B + \alpha\beta |V\rangle_A |H\rangle_B + \alpha\beta |H\rangle_A |V\rangle_B.
\end{aligned}
\qquad (7.23)
\]

For our cloning machine to work, Equation 7.22 must hold true for all $|\psi\rangle_A$. In
particular, it must be true if either $\alpha = 0$ and $\beta = 1$ or if $\beta = 0$ and $\alpha = 1$,
conditions which lead respectively to

\[ \hat{U} |H\rangle_A |\phi\rangle_B = |H\rangle_A |H\rangle_B, \qquad \hat{U} |V\rangle_A |\phi\rangle_B = |V\rangle_A |V\rangle_B. \qquad (7.24) \]

We now expand the left-hand side of Equation 7.22:


\[
\begin{aligned}
\hat{U} |\psi\rangle_A |\phi\rangle_B &= \hat{U} \left( \alpha |V\rangle_A + \beta |H\rangle_A \right) |\phi\rangle_B \\
&= \alpha \left( \hat{U} |V\rangle_A |\phi\rangle_B \right) + \beta \left( \hat{U} |H\rangle_A |\phi\rangle_B \right) \\
&= \alpha |V\rangle_A |V\rangle_B + \beta |H\rangle_A |H\rangle_B.
\end{aligned}
\qquad (7.25)
\]


In this expansion we have used the fact that operator $\hat{U}$ is linear to write the
second line, and used Equations 7.24 to write the last equality. But wait a minute!
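The right-hand sides of Equations 7.23 and 7.25 should be equal if the cloning
machine is able to clone any state, but they clearly are not: Equation 7.23
contains cross terms proportional to $\alpha\beta$ that are absent from Equation 7.25,
so the two agree only when one of $\alpha$ and $\beta$ is 1 and the other is 0. We are forced to the
conclusion that it is not possible to clone an arbitrary photon state. In fact, these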
arguments can be generalized for all quantum-mechanical systems, which gives us 


The no-cloning theorem 


The linearity of operators in quantum mechanics forbids the cloning of 
quantum states. 
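The mismatch at the heart of the argument can be made concrete with a few numbers. In this sketch, two-photon states are written as lists of four amplitudes in the order (VV, VH, HV, HH); that ordering, and the function names, are assumptions made purely for the illustration. One function returns the state that linearity forces the machine to produce (Equation 7.25), the other returns what a perfect clone would be (Equation 7.23).

```python
import math


def forced_by_linearity(alpha, beta):
    """Equation 7.25: output fixed by linearity together with Equations 7.24."""
    return (alpha, 0.0, 0.0, beta)


def perfect_clone(alpha, beta):
    """Equation 7.23: the product state a cloning machine would have to produce."""
    return (alpha * alpha, alpha * beta, alpha * beta, beta * beta)


def agree(u, v, tol=1e-12):
    """True if two amplitude lists match component by component."""
    return all(abs(x - y) < tol for x, y in zip(u, v))


examples = {
    "vertical": (1.0, 0.0),
    "horizontal": (0.0, 1.0),
    "diagonal": (1 / math.sqrt(2), 1 / math.sqrt(2)),
}
for name, (alpha, beta) in examples.items():
    print(name, agree(forced_by_linearity(alpha, beta), perfect_clone(alpha, beta)))
```

The two basis states used to define the machine are copied correctly, but an equal superposition of them is not, which is the content of the theorem for this simple case.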


7.3.2 Specifying a quantum bit 


Before looking at a specific scheme that achieves quantum teleportation, we make 
some remarks that shed light on the scale of the problem. It is helpful to compare 
the way information is stored in classical and quantum physics. The smallest 
amount of information in classical (non-quantum) physics is the bit, typically 
represented by a switch being in one of two states: on or off, representing 1 or 0. 
Anyone who has recently bought a computer knows that memories are measured
in gigabytes ($10^9$ bytes), where one byte is 8 bits. At first sight, the quantum unit
of information is similar. The unit of quantum information is called a quantum
bit, or qubit. In a qubit, the information is contained in the state vector of a
system with two orthogonal eigenvectors. An example is provided by the state in
Equation 7.21:


\[ |\psi\rangle = \alpha |V\rangle + \beta |H\rangle. \qquad (7.26) \]


In this case, the information is contained in the polarization states of a photon;
since $|V\rangle$ and $|H\rangle$ form a complete set for the description of the polarization of a
photon, Equation 7.26 is general enough to describe the polarization of any
photon. The equivalent to ‘on’ for a classical bit might be the state with $\alpha = 1$
and $\beta = 0$: a vertically-polarized photon. In that case, classical ‘off’ would be the
equivalent to the state with $\alpha = 0$ and $\beta = 1$: a horizontally-polarized photon.


However, $\alpha$ and $\beta$ need not be equal to 1 or 0, and this makes a quantum bit very
different from a classical bit. The classical bit represents a 0 or a 1, the least 
possible amount of information, whereas the qubit requires, in general, an infinite 
amount of information to specify it. That is one reason why it cannot be cloned, 
and is also a reason why very special techniques are required to teleport even a 
single qubit. Teleporting a person, who would be represented by a truly vast 
number of qubits, is not on the horizon. 


To see why a single qubit represents so much information, look again at 
Equation 7.26. The complex probability amplitudes $\alpha$ and $\beta$ can be expressed 
using four real numbers. However, the state vector must be normalized, 
$|\alpha|^2 + |\beta|^2 = 1$, and this condition reduces the number of independent real numbers 
by one, down to three. Moreover, the physical state is unaffected by an overall 
phase factor (a complex number of modulus 1), reducing the number of independent 
numbers to two. It can be shown that any normalized state of the form in 
Equation 7.26 can be represented in the form 


$$|\psi\rangle = e^{-i\bar\phi/2}\cos(\bar\theta/2)\,|V\rangle + e^{i\bar\phi/2}\sin(\bar\theta/2)\,|H\rangle. \tag{7.27}$$


As $\bar\theta$ runs from 0 to $\pi$, and $\bar\phi$ runs from 0 to $2\pi$, all possible normalized and 
distinct states are represented. The angles $\bar\theta$ and $\bar\phi$ are the polar and azimuthal 
angles representing the points on a sphere, as shown in Figure 7.10. 




Figure 7.10 The state of a qubit is defined by two angles, $\bar\theta$ and $\bar\phi$, representing points on a sphere. For photon polarization states, this sphere is called the Poincaré sphere. 

As you can verify from Equation 7.27 and as is indicated on the figure, the point 
with $\bar\theta = 0$ (the ‘North pole’) represents the state $|V\rangle$, and the point with $\bar\theta = \pi$ 
(the ‘South pole’) represents the state $|H\rangle$. At either of these points, the value of $\bar\phi$ 
is undefined. 
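
The parametrization in Equation 7.27 can be explored numerically. The following is a minimal sketch, not part of the original text: the function names `qubit_amplitudes` and `norm_squared` are illustrative, and the pair returned is taken to be the amplitudes of $|V\rangle$ and $|H\rangle$ in that order.

```python
import cmath
import math


def qubit_amplitudes(theta_bar, phi_bar):
    """Return (alpha, beta), the amplitudes of |V> and |H> in Equation 7.27."""
    alpha = cmath.exp(-1j * phi_bar / 2) * math.cos(theta_bar / 2)
    beta = cmath.exp(1j * phi_bar / 2) * math.sin(theta_bar / 2)
    return alpha, beta


def norm_squared(alpha, beta):
    """Sum of the two probabilities; equals 1 for a normalized state."""
    return abs(alpha) ** 2 + abs(beta) ** 2


# The poles of the sphere: theta_bar = 0 gives |V>, theta_bar = pi gives |H>.
north = qubit_amplitudes(0.0, 0.0)
south = qubit_amplitudes(math.pi, 0.0)

# Any other pair of angles still gives a normalized state.
generic = qubit_amplitudes(0.3, 1.1)
print(norm_squared(*generic))
```

Changing `phi_bar` at either pole alters only an overall phase, which is why the azimuthal angle is undefined there.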


The angles $\bar\theta$ and $\bar\phi$ appearing in the parametrized form, Equation 7.27, have the 
bar on top for a good reason: $\bar\theta$ and $\bar\phi$ are not at all related to the angle at which 
the photon is polarized. According to Equation 7.27, $\bar\theta = \pi$ represents not a 
photon vertically polarized at an angle $\pi$ to the z-axis, but a horizontally-polarized 
photon. The following exercise brings this home. 


Exercise 7.11 What values of $\bar\theta$ and $\bar\phi$ represent the states $|V_\theta\rangle$ and $|H_\theta\rangle$ for 
$\theta = \pi/4$? 


The parametrized form of Equation 7.27 shows that the possible states of 
polarization of a single photon can be put into one-to-one correspondence with the 
points on the surface of a sphere, and there are an infinite number of these points. 
Remember that $\bar\theta$ and $\bar\phi$ could have any values in their allowed ranges. To specify 
any value of $\bar\theta$ or $\bar\phi$ requires, in principle, an infinite number of decimal places. 
Therefore teleporting a single qubit requires an infinite amount of information to 
be transferred. One simply cannot visualize writing the specification of an 
arbitrary state on paper, and carrying the paper, even for a single arbitrary qubit. 


7.3.3 A general scheme for quantum teleportation 


At first glance then, it might appear that teleportation is impossible. However, as 
we'll see below, recreating the exact quantum state of a system at a remote 
location is possible with the proviso that in the act of teleporting the quantum state 
no information about that quantum state is gained by the sender. Note that we are 
talking about teleporting information and not actual particles; the information 
may, however, be used to recreate the state of a particle. It is NOT cloning since 
the state of the original particle is destroyed by the act of teleportation. 


Suppose that Alice is asked to transmit an unknown quantum state $|\psi\rangle$ of a particle 
to Bob at a distant location, without sending the original particle, so that Bob can 
make an exact replica of $|\psi\rangle$. For the purposes of illustration, we'll consider the 
case where $|\psi\rangle$ represents the polarization state of a photon which we will label 
with a subscript 1. In general, the state of photon 1 can be represented as 


$$|\psi\rangle_1 = \alpha\,|V\rangle_1 + \beta\,|H\rangle_1, \tag{7.28}$$


with $|\alpha|^2 + |\beta|^2 = 1$ for normalization since $|H\rangle$ and $|V\rangle$ are orthonormal. 
Clearly, if Alice attempts to learn about the state $|\psi\rangle_1$, her measurement will 
collapse $|\psi\rangle_1$ onto her measurement basis, destroying the original state in the 
process, and leaving almost no information about it. In order to teleport the state 
$|\psi\rangle_1$ to Bob, an extra entangled pair of photons is required. We assume that a 
source of entangled photons is available which produces photons 2 and 3 in one of 
the Bell states mentioned earlier, the one represented by the state vector 


$$|\Psi^-\rangle_{23} = \frac{1}{\sqrt{2}}\big(|V\rangle_2|H\rangle_3 - |H\rangle_2|V\rangle_3\big). \tag{7.29}$$

The source of photons 2 and 3 is arranged so that photon 2 is sent to Alice, and 
photon 3 is sent to Bob. This establishes the possibility of quantum non-local 
correlations between Alice and Bob (due to the entanglement between photons 2 
and 3), but the entangled photon pair at this point does not contain any 
information about $|\psi\rangle_1$. A schematic view of the scheme is shown in Figure 7.11, 
some features of which we explain shortly. 

Figure 7.11 The photon whose state is to be teleported and one of a pair of entangled photons enter Alice’s beam splitter. The second of the entangled photons goes directly to Bob, who receives instructions from Alice, via the classical route, as to how to process it. 




Any measurement that Bob might make on photon 3 will not reveal any 
information about $|\psi\rangle_1$. However, if Alice is somehow able to couple together 
photons 1 and 2, in a way that will be described shortly, then a measurement on 
photons 1 and 2 can indeed facilitate the transfer of information about photon 1 to 
photon 3. 


To see how this works, we first note that the state vector of all three photons is the 
product state $|\psi\rangle_1\,|\Psi^-\rangle_{23}$, which may be expanded using Equations 7.28 and 7.29: 


$$\begin{aligned}
|\Psi\rangle_{123} &= |\psi\rangle_1\,|\Psi^-\rangle_{23} \\
&= \frac{\alpha}{\sqrt{2}}\big(|V\rangle_1|V\rangle_2|H\rangle_3 - |V\rangle_1|H\rangle_2|V\rangle_3\big)
 + \frac{\beta}{\sqrt{2}}\big(|H\rangle_1|V\rangle_2|H\rangle_3 - |H\rangle_1|H\rangle_2|V\rangle_3\big).
\end{aligned} \tag{7.30}$$


Although it is photons 2 and 3 that are originally prepared in a particular Bell 
state (Equation 7.29), we shall need to consider the full set of four Bell states 
mentioned earlier, this time labelled for photons 1 and 2, as follows: 


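$$|\Psi^\pm\rangle_{12} = \frac{1}{\sqrt{2}}\big(|V\rangle_1|H\rangle_2 \pm |H\rangle_1|V\rangle_2\big), \tag{7.31a}$$

$$|\Phi^\pm\rangle_{12} = \frac{1}{\sqrt{2}}\big(|V\rangle_1|V\rangle_2 \pm |H\rangle_1|H\rangle_2\big). \tag{7.31b}$$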


We can then express the state vector for the three photons in Equation 7.30 in 
terms of the Bell states for photons 1 and 2 as 


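$$\begin{aligned}
|\Psi\rangle_{123} = \frac{1}{2}\Big[\,
 & |\Psi^-\rangle_{12}\big(-\alpha\,|V\rangle_3 - \beta\,|H\rangle_3\big)
 + |\Psi^+\rangle_{12}\big(-\alpha\,|V\rangle_3 + \beta\,|H\rangle_3\big) \\
 & + |\Phi^-\rangle_{12}\big(\beta\,|V\rangle_3 + \alpha\,|H\rangle_3\big)
 + |\Phi^+\rangle_{12}\big(-\beta\,|V\rangle_3 + \alpha\,|H\rangle_3\big)\Big].
\end{aligned} \tag{7.32}$$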


To verify Equation 7.32, you can insert the expressions 7.31 into it, and show that 
it can then be rearranged to give Equation 7.30. 
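
The algebra can also be checked numerically for one arbitrary choice of amplitudes. The sketch below is an illustration rather than a proof. It assumes NumPy is available and represents $|V\rangle$ and $|H\rangle$ by the column vectors (1, 0) and (0, 1); the function names are hypothetical.

```python
import numpy as np

V = np.array([1.0, 0.0], dtype=complex)
H = np.array([0.0, 1.0], dtype=complex)
s = 1 / np.sqrt(2)

# Two-photon Bell states (Equations 7.29 and 7.31); np.kron(x, y) means |x>|y>.
psi_minus = s * (np.kron(V, H) - np.kron(H, V))
psi_plus = s * (np.kron(V, H) + np.kron(H, V))
phi_minus = s * (np.kron(V, V) - np.kron(H, H))
phi_plus = s * (np.kron(V, V) + np.kron(H, H))


def three_photon_state(alpha, beta):
    """Equation 7.30: photon 1 in alpha|V> + beta|H>, photons 2 and 3 in |Psi^->."""
    return np.kron(alpha * V + beta * H, psi_minus)


def bell_expansion(alpha, beta):
    """Right-hand side of Equation 7.32 (Bell states of photons 1, 2 times photon 3)."""
    return 0.5 * (
        np.kron(psi_minus, -alpha * V - beta * H)
        + np.kron(psi_plus, -alpha * V + beta * H)
        + np.kron(phi_minus, beta * V + alpha * H)
        + np.kron(phi_plus, -beta * V + alpha * H)
    )


def random_amplitudes(seed=0):
    """A normalized pair of complex amplitudes chosen at random."""
    rng = np.random.default_rng(seed)
    a, b = rng.normal(size=2) + 1j * rng.normal(size=2)
    n = np.sqrt(abs(a) ** 2 + abs(b) ** 2)
    return a / n, b / n


alpha, beta = random_amplitudes()
difference = np.max(np.abs(three_photon_state(alpha, beta) - bell_expansion(alpha, beta)))
print(difference)  # should be zero to within rounding error
```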


Equation 7.32 can be viewed as a linear combination of the four Bell states for 
photons 1 and 2, with coefficients that happen to represent states of photon 3. If 
Alice could make a combined measurement on photons 1 and 2 that collapsed the 
state vector for those two photons onto one of the four Bell states, a so-called Bell 
state measurement, then photon 3 would collapse to the state represented by the 
corresponding factor in round brackets in Equation 7.32. For example, if Alice 
were to make a measurement that collapsed $|\Psi\rangle_{123}$ onto $|\Psi^-\rangle_{12}$, then photon 3 
would instantaneously be in the normalized state $-\alpha\,|V\rangle_3 - \beta\,|H\rangle_3$. 


In summary: if Alice makes a ‘Bell state measurement’ on photons 1 and 2 in 
the ‘Bell state basis’ (the four vectors listed in Equations 7.31), then $|\Psi\rangle_{123}$ 
will collapse onto one of the four terms in Equation 7.32, each occurring with 
probability 1/4. Examining each of the four terms in Equation 7.32, we see that 
the state of photon 3, call it $|\phi\rangle_3$, following a Bell state measurement by Alice on 
photons 1 and 2, is related to the original state $|\psi\rangle_1$ of photon 1, as summarized in 
column two of Table 7.2. 


Line one of the table says that if Alice’s measurement yields $|\Psi^-\rangle_{12}$, then Bob’s 
photon is indeed in exactly the state that photon 1 had been in, apart from the 
overall minus sign. But such an overall sign is not significant, from Principle 1b 
of Section 5.6. However, if Alice’s pair collapses onto $|\Psi^+\rangle_{12}$, then the resulting 
state for photon 3, $-\alpha\,|V\rangle_3 + \beta\,|H\rangle_3$, is not just a minus sign times the original 
state of photon 1; relative phases matter even though overall phases do not. 
Somehow, all the information present in $\alpha$ and $\beta$, with all their (in principle) 
infinite strings of decimal places, is present in a slightly garbled form. It is even 
more garbled if Alice finds Bell states $|\Phi^-\rangle_{12}$ or $|\Phi^+\rangle_{12}$ as a result of her Bell 
state measurement. 


Exercise 7.12 Write down the states of photon 3 as received by Bob in the 
cases that Alice finds $|\Phi^-\rangle_{12}$ or $|\Phi^+\rangle_{12}$. 


We now see the point of the classical communication link from Alice to Bob. 
Alice sends a message, requiring just two classical bits, telling Bob which one 
of the four Bell states she found. Bob then operates on his photon with the 
appropriate choice of one out of four well-defined physical transformations. The 
transformation is chosen to transform the photon that he received into a photon in 
the exact quantum state of the original photon 1 received by Alice. 


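The operations required by Bob can be represented as matrix transformations. 
First note that $|\psi\rangle_1$ can be represented by the matrix of its probability amplitudes, 
as in Equations 7.1 and 7.2. The four states $|\phi\rangle_3$ in Table 7.2 are, in order, 
represented by the matrices 


$$\begin{pmatrix} -\alpha \\ -\beta \end{pmatrix}, \quad
\begin{pmatrix} -\alpha \\ \beta \end{pmatrix}, \quad
\begin{pmatrix} \beta \\ \alpha \end{pmatrix}, \quad
\begin{pmatrix} -\beta \\ \alpha \end{pmatrix}. \tag{7.33}$$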




(Margin note on verifying Equation 7.32: This is a straightforward but lengthy task, which you need not perform.) 


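Table 7.2 If Alice’s Bell state measurement places her two photons in the state given in column one, then Bob’s photon will collapse onto the corresponding state in column two. 

Measurement              $|\phi\rangle_3$
$|\Psi^-\rangle_{12}$    $-\alpha\,|V\rangle_3 - \beta\,|H\rangle_3$
$|\Psi^+\rangle_{12}$    $-\alpha\,|V\rangle_3 + \beta\,|H\rangle_3$
$|\Phi^-\rangle_{12}$    $\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$
$|\Phi^+\rangle_{12}$    $-\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$

Now, Table 7.3 presents the 2 × 2 square matrices that transform the state of the 
photon (photon 3) received by Bob into the state of the original unknown photon 
(photon 1). The transformation in the first row is simply −1 times the unit matrix. 
The transformations all maintain the normalization of any state vector on which 
they operate. 

(Margin note: Matrices that maintain the normalization of any vector during multiplication, and have an inverse, are called unitary matrices, and they describe unitary transformations.) 

Table 7.3 Transforming $|\phi\rangle_3$ into $|\psi\rangle_3$. 

$|\phi\rangle_3$    Matrix form    Transformation matrix
$-\alpha\,|V\rangle_3 - \beta\,|H\rangle_3$    $\begin{pmatrix} -\alpha \\ -\beta \end{pmatrix}$    $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$
$-\alpha\,|V\rangle_3 + \beta\,|H\rangle_3$    $\begin{pmatrix} -\alpha \\ \beta \end{pmatrix}$    $\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$
$\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$    $\begin{pmatrix} \beta \\ \alpha \end{pmatrix}$    $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$
$-\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$    $\begin{pmatrix} -\beta \\ \alpha \end{pmatrix}$    $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$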


Exercise 7.13 Show that the transformation described by the fourth 
transformation matrix, $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, preserves the normalization of any state vector. 
You should consider the action of this matrix on the general polarization state 
$\gamma\,|V\rangle + \delta\,|H\rangle$, where $|\gamma|^2 + |\delta|^2 = 1$. 


Exercise 7.14 Verify that the third transformation matrix does indeed 
transform $|\phi\rangle_3$, as represented in the third row, into $|\psi\rangle_3$. 


Once Alice has sent her two pieces of classical information to Bob specifying 
which of the four Bell states for photons 1 and 2 resulted from her measurement, 
then Bob can use suitable equipment that applies the required transformation to 
photon 3, converting it into an exact replica of photon 1, and teleportation of the 
state of photon 1 to photon 3 is achieved! But what about Alice: does she have 
any information regarding the initial state $|\psi\rangle_1$ of photon 1? Her Bell state 
measurement has left photons 1 and 2 in one of the four Bell states given in 
Equations 7.31, and these equations contain no information about $|\psi\rangle_1$. Thus, in 
the process of teleporting the state of photon 1, Alice has learnt nothing about this 
state; the original state $|\psi\rangle_1$ has been destroyed. In short, the infinite amount of 
information required to define the quantum state has been teleported to Bob, 
without Alice ever knowing that information. 
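
The whole scheme of this subsection can be sketched numerically: project photons 1 and 2 onto each Bell state, read off the resulting state of photon 3, and apply the matching transformation. This is a hedged illustration only. It assumes NumPy, represents $|V\rangle$ and $|H\rangle$ by (1, 0) and (0, 1), and uses the hypothetical names `bell`, `correction` and `teleport`; the example amplitudes 0.6 and 0.8i are arbitrary.

```python
import numpy as np

V = np.array([1, 0], dtype=complex)
H = np.array([0, 1], dtype=complex)
s = 1 / np.sqrt(2)

# The four Bell states of Equations 7.31 as vectors in the two-photon space.
bell = {
    "Psi-": s * (np.kron(V, H) - np.kron(H, V)),
    "Psi+": s * (np.kron(V, H) + np.kron(H, V)),
    "Phi-": s * (np.kron(V, V) - np.kron(H, H)),
    "Phi+": s * (np.kron(V, V) + np.kron(H, H)),
}

# Bob's transformation for each of Alice's possible results.
correction = {
    "Psi-": np.array([[-1, 0], [0, -1]], dtype=complex),
    "Psi+": np.array([[-1, 0], [0, 1]], dtype=complex),
    "Phi-": np.array([[0, 1], [1, 0]], dtype=complex),
    "Phi+": np.array([[0, 1], [-1, 0]], dtype=complex),
}


def teleport(alpha, beta):
    """Return {Bell result: (probability, Bob's state after his transformation)}."""
    psi1 = alpha * V + beta * H
    # Rows index photons 1 and 2 jointly; columns index photon 3.
    total = np.kron(psi1, bell["Psi-"]).reshape(4, 2)
    results = {}
    for name, b in bell.items():
        unnormalized = b.conj() @ total          # <Bell_12 | Psi_123>
        prob = float(np.vdot(unnormalized, unnormalized).real)
        phi3 = unnormalized / np.sqrt(prob)      # state of photon 3 after collapse
        results[name] = (prob, correction[name] @ phi3)
    return results


alpha, beta = 0.6, 0.8j
outcomes = teleport(alpha, beta)
for name, (prob, recovered) in outcomes.items():
    print(name, prob, recovered)
```

In this sketch every result occurs with probability 1/4, and Bob's corrected state has the amplitudes of the original state of photon 1, whichever Bell state Alice finds.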


7.3.4 The first teleportation experiment 


In Figure 7.11, photons 1 and 2 are shown incident upon a beam splitter belonging 
to Alice. This beam splitter plays a key role in the Bell state measurement in 
the first published teleportation measurement due to Bouwmeester et al. in 
Zeilinger’s group in Austria. Although successful, it was not complete, in the 
following sense: only one of the four Bell state measurements could be made. The 
equipment shown schematically in Figure 7.11 is able to give a signal when the 
combined state of photons 1 and 2 collapses onto $|\Psi^-\rangle_{12}$, which will happen 25% 
of the time. When this happens, Alice sends a signal by the classical channel to 
Bob, who can then, and only then, declare that the state of photon 1 has been 
teleported to him (remember that an overall factor of −1 does not change the 
nature of the state). Although the teleportation scheme of Zeilinger’s team only 
worked 25% of the time, it was nevertheless a great achievement. 

(Margin note: The first successful teleportation experiment was reported by D. Bouwmeester et al. (1997) Nature, vol. 390, page 575.) 


But how is Alice’s Bell state measurement carried out? The next subsection gives 
the details. It is an important part of the story, but may be skimmed if you are 
short of time. 


7.3.5 Carrying out the Bell state measurement 

Alice’s Bell state measurement is based on the peculiar properties of interference 
between two photons falling simultaneously on a (non-polarizing) 50:50 beam 
splitter (Figure 7.12). Such a beam splitter is an optical device that splits an 
incoming beam of light in two by reflecting one half of the beam and transmitting 
the other half. 

(Margin note: You have previously met beam splitters, in the form of half-silvered mirrors, in Chapter 1 of Book 1, and also in Chapter 5 of this book.) 


Figure 7.12 A beam splitter splits a beam of light in two by reflecting one half of the incoming beam and transmitting the other half. At the single photon level, each photon has an equal probability of being detected in the reflected and transmitted directions. A beam splitter has two input directions, or input ‘spatial modes’, which we denote by $|a\rangle$ and $|b\rangle$, and two output spatial modes, which we denote by $|c\rangle$ and $|d\rangle$. For example, a photon in the input spatial mode $|a\rangle$ may be detected as being reflected into the output spatial mode $|c\rangle$ or transmitted into the output spatial mode $|d\rangle$. 


Photons 1 and 2 are initially distinguishable by virtue of the fact that they are 
coming from different places — Alice can tell which is which by virtue of their 
propagation direction in space. If Alice is to make a measurement which projects 
onto one of the four Bell states of photons 1 and 2 (Equations 7.31), she must 
arrange for the photons to be indistinguishable. In order to make photons 1 and 2 
indistinguishable, Alice must discard all photons that are not of exactly the same 
frequency, with a frequency filter. She must also find a way to make their spatial 
wave functions overlap. This can be achieved using a 50:50 beam splitter. 


So far, the only property of photons that we have specified is their polarization, 
which we represent with kets such as $|H\rangle$ and $|V\rangle$. We must now also consider the 
spatial part of the photon state, and to do this we augment the ket specifying the 
photon polarization with a ket specifying its path through space. The total state 
vector for the photon $|\Psi\rangle$ is the product of a ket $|\chi\rangle$ specifying the spatial state 
and a polarization ket $|\psi\rangle$: $|\Psi\rangle = |\psi\rangle\,|\chi\rangle$. A beam splitter has two possible 
input directions, or input spatial modes, which we will denote by $|a\rangle$ and $|b\rangle$; 
see Figure 7.12. So, for example, the state of a horizontally-polarized photon 
propagating in the direction corresponding to the $|a\rangle$ spatial mode is $|H\rangle|a\rangle$. 


A photon arriving at the beam splitter, in either input mode, has equal probabilities 
of being reflected or transmitted; see Figure 7.12. This figure also shows the two 
output modes, $|c\rangle$ and $|d\rangle$. It turns out that the action of the beam splitter on the 
input modes can be written as 


$$|a\rangle \to \frac{i}{\sqrt{2}}\,|c\rangle + \frac{1}{\sqrt{2}}\,|d\rangle, \tag{7.34}$$

$$|b\rangle \to \frac{1}{\sqrt{2}}\,|c\rangle + \frac{i}{\sqrt{2}}\,|d\rangle. \tag{7.35}$$

These equations say that a photon initially prepared in the input spatial mode $|a\rangle$ 
will emerge from the beam splitter in a superposition of output spatial modes $|c\rangle$ 
and $|d\rangle$, as will a photon initially in mode $|b\rangle$. A photon incident in either input 
mode can thus be detected with equal probability in either of the output modes $|c\rangle$ 
and $|d\rangle$. The factor $i$ present in one term of both right-hand sides arises because 
there is a phase change when light is reflected. 
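
Equations 7.34 and 7.35 can be collected into a single 2 × 2 matrix acting on the mode amplitudes. The sketch below is illustrative only: it assumes NumPy, orders the input amplitudes as ($|a\rangle$, $|b\rangle$) and the output amplitudes as ($|c\rangle$, $|d\rangle$), and the names `BEAM_SPLITTER` and `output_amplitudes` are not from the text.

```python
import numpy as np

# Columns: input modes |a>, |b>.  Rows: output modes |c>, |d>.
BEAM_SPLITTER = np.array([[1j, 1],
                          [1, 1j]], dtype=complex) / np.sqrt(2)


def output_amplitudes(input_amplitudes):
    """Amplitudes in modes (c, d) for given amplitudes in modes (a, b)."""
    return BEAM_SPLITTER @ np.asarray(input_amplitudes, dtype=complex)


out_a = output_amplitudes([1, 0])        # photon enters in mode |a>
out_b = output_amplitudes([0, 1])        # photon enters in mode |b>
probabilities_a = np.abs(out_a) ** 2     # detection probabilities in c and d

# A physically sensible beam splitter conserves probability, so the matrix is unitary.
is_unitary = np.allclose(BEAM_SPLITTER.conj().T @ BEAM_SPLITTER, np.eye(2))
print(probabilities_a, is_unitary)
```

The factor $i$ on the reflected amplitudes is what makes this matrix unitary; with all four entries equal to $1/\sqrt{2}$ it would not conserve probability.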


● If a photon represented by the state vector $|\Psi\rangle = \frac{1}{\sqrt{2}}\big(|V\rangle - |H\rangle\big)|b\rangle$ is 
incident upon a beam splitter, write down the state vector after the beam 
splitter. 


○ We use the rule in Equation 7.35 to write the state vector after the beam 
splitter as 


$$\frac{1}{2}\big(|V\rangle - |H\rangle\big)\big(|c\rangle + i\,|d\rangle\big).$$


Exercise 7.15 If a horizontally-polarized photon is incident upon a beam 
splitter in spatial mode $|a\rangle$, write down the total state vector following the beam 
splitter. 


Now suppose that Alice, following the scheme shown in Figure 7.11, arranges 
for photon 1 to be in input mode $|a\rangle$ and photon 2 to be in input mode $|b\rangle$. 
Each photon has the same 50:50 probability of being transmitted or reflected, 
and so four different possibilities arise (see Figure 7.13): (i) both photons are 
transmitted; (ii) photon 1 is reflected and photon 2 is transmitted; (iii) photon 1 is 
transmitted and photon 2 is reflected; (iv) both photons are reflected. Each of 
these cases occurs with equal probability. If Alice detects a photon in either 
output mode, she cannot tell whether it is photon 1 or photon 2: the photons 
have been rendered indistinguishable by the action of the beam splitter and the 
frequency filter that eliminates photons that do not have the same frequency. 


In order to understand the effect of Alice’s measurement on photons 1 and 2, we 
need to examine what happens to the state $|\Psi\rangle_{123}$ in Equation 7.30. However, to 
keep the book-keeping simple, we’ll examine what happens to the general state of 
photons 1 and 2 given by the following product of two terms, one term being a 
linear combination of kets for photon 1, and the other a linear combination of kets 
for photon 2: 


$$|\Psi_{\text{in}}\rangle = \big(\alpha\,|V\rangle_1 + \beta\,|H\rangle_1\big)|a\rangle_1 \times \big(\gamma\,|V\rangle_2 + \delta\,|H\rangle_2\big)|b\rangle_2. \tag{7.36}$$


In this equation, we have written the polarization part of the state vector for 
each photon as a superposition in the H/V basis, allowing any polarization state 
of the photons to be represented with suitable amplitudes $\alpha$, $\beta$, $\gamma$ and $\delta$. From 
Equations 7.34 and 7.35, this state evolves into the following state after the beam 
splitter: 


$$|\Psi\rangle_{12} = \frac{1}{\sqrt{2}}\big(\alpha\,|V\rangle_1 + \beta\,|H\rangle_1\big)\big(i\,|c\rangle_1 + |d\rangle_1\big)
\times \frac{1}{\sqrt{2}}\big(\gamma\,|V\rangle_2 + \delta\,|H\rangle_2\big)\big(|c\rangle_2 + i\,|d\rangle_2\big). \tag{7.37}$$


Figure 7.13 The four possible output mode combinations for one photon in 
each of the input modes $|a\rangle$ and $|b\rangle$: (i) both photons are transmitted and one 
photon is detected in each of the output modes $|c\rangle$ and $|d\rangle$; (ii) and (iii) one photon 
is reflected and one photon is transmitted such that both photons are detected in 
the same output mode; (iv) both photons are reflected such that the photons are 
detected in different output modes. 


Because the photons (bosons) are indistinguishable after passing through the beam 
splitter, the total two-photon state vector after the beam splitter, including both the 
spatial and polarization parts, must be symmetric with respect to exchange of the 
labels 1 and 2. This requires that we symmetrize the final state vector as 


$$|\Psi_{\text{out}}\rangle = \frac{1}{\sqrt{2}}\big(|\Psi\rangle_{12} + |\Psi\rangle_{21}\big), \tag{7.38}$$


where $|\Psi\rangle_{21}$ is formed from $|\Psi\rangle_{12}$ by interchanging the 1 and 2 subscripts 
throughout: 


$$|\Psi\rangle_{21} = \frac{1}{\sqrt{2}}\big(\alpha\,|V\rangle_2 + \beta\,|H\rangle_2\big)\big(i\,|c\rangle_2 + |d\rangle_2\big)
\times \frac{1}{\sqrt{2}}\big(\gamma\,|V\rangle_1 + \delta\,|H\rangle_1\big)\big(|c\rangle_1 + i\,|d\rangle_1\big). \tag{7.39}$$

● Verify explicitly that $|\Psi_{\text{out}}\rangle$ is symmetric under the interchange of photons 1 
and 2. 


○ By construction, interchanging subscripts 1 and 2 throughout all terms in 
$|\Psi_{\text{out}}\rangle$ simply interchanges $|\Psi\rangle_{12}$ and $|\Psi\rangle_{21}$, leaving $|\Psi_{\text{out}}\rangle$ unchanged. 


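After substituting Equations 7.37 and 7.39 into Equation 7.38 and some tedious 
algebra, we obtain the following expression for the final state vector of the two 
photons: 


$$\begin{aligned}
|\Psi_{\text{out}}\rangle ={}& \frac{i(\alpha\gamma + \beta\delta)}{\sqrt{2}}\,
 \frac{1}{\sqrt{2}}\big(|c\rangle_1|c\rangle_2 + |d\rangle_1|d\rangle_2\big)\,
 \frac{1}{\sqrt{2}}\big(|V\rangle_1|V\rangle_2 + |H\rangle_1|H\rangle_2\big) \\
&+ \frac{i(\alpha\gamma - \beta\delta)}{\sqrt{2}}\,
 \frac{1}{\sqrt{2}}\big(|c\rangle_1|c\rangle_2 + |d\rangle_1|d\rangle_2\big)\,
 \frac{1}{\sqrt{2}}\big(|V\rangle_1|V\rangle_2 - |H\rangle_1|H\rangle_2\big) \\
&+ \frac{i(\alpha\delta + \beta\gamma)}{\sqrt{2}}\,
 \frac{1}{\sqrt{2}}\big(|c\rangle_1|c\rangle_2 + |d\rangle_1|d\rangle_2\big)\,
 \frac{1}{\sqrt{2}}\big(|V\rangle_1|H\rangle_2 + |H\rangle_1|V\rangle_2\big) \\
&+ \frac{(\alpha\delta - \beta\gamma)}{\sqrt{2}}\,
 \frac{1}{\sqrt{2}}\big(|d\rangle_1|c\rangle_2 - |c\rangle_1|d\rangle_2\big)\,
 \frac{1}{\sqrt{2}}\big(|V\rangle_1|H\rangle_2 - |H\rangle_1|V\rangle_2\big).
\end{aligned} \tag{7.40}$$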


This rather complicated equation reveals that the state vector after the beam 
splitter is a superposition of the four Bell states in Equations 7.31. Each of the 
four terms has (i) a factor specifying the spatial state, and (ii) a factor involving 
$|H\rangle$ and $|V\rangle$ terms specifying the joint polarization states; the latter represents one of 
the four Bell states. For the first three terms in the equation, the spatial factor is 


$$\frac{1}{\sqrt{2}}\big(|c\rangle_1|c\rangle_2 + |d\rangle_1|d\rangle_2\big), \tag{7.41}$$


which corresponds to the photons being detected on the same side of the beam 
splitter: either both in the $|c\rangle$ mode, or both in the $|d\rangle$ mode. 


● Explain why this last statement is true. 


○ If one photon is detected in a counter corresponding to emergence from the 
beam splitter in spatial mode $|c\rangle$, for example, then the state function 7.41 will 
collapse onto the first term, $|c\rangle_1|c\rangle_2$. If one photon is measured to be in the 
$|d\rangle$ state, then the state vector will collapse onto the second term, $|d\rangle_1|d\rangle_2$. 
Hence detection of coincident photons on the same side of the beam splitter 
indicates that $|\Psi_{\text{out}}\rangle$ has collapsed onto a state represented by one of the first 
three terms in Equation 7.40. 


On the other hand, the spatial part of the fourth term is 


$$\frac{1}{\sqrt{2}}\big(|d\rangle_1|c\rangle_2 - |c\rangle_1|d\rangle_2\big), \tag{7.42}$$


which tells us that the photons are detected coming from the beam splitter in 
different spatial modes: if photon 1 is detected in mode $|c\rangle$, then photon 2 will 
be detected in mode $|d\rangle$, and vice versa. In both cases, the output spatial modes of 
the two photons have become entangled. Equation 7.40 therefore tells us that the 
photons collapse onto the Bell state $|\Psi^-\rangle_{12}$ if and only if they are detected in 
different output modes after the beam splitter, in which case the state vector 
Equation 7.40 has collapsed onto the fourth term. This will happen 25% of the 
time. 
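
The link between the polarization Bell state and where the photons emerge can be checked numerically. The following is a sketch under stated assumptions, not the authors' calculation: it assumes NumPy, feeds each of the four Bell polarization states into the beam splitter with photon 1 in mode $|a\rangle$ and photon 2 in mode $|b\rangle$, symmetrizes as in Equation 7.38, and computes the probability that the two photons are found in different output modes. All names are illustrative.

```python
import numpy as np

s = 1 / np.sqrt(2)
# Beam splitter of Equations 7.34 and 7.35: rows are modes c, d; columns are modes a, b.
U = s * np.array([[1j, 1], [1, 1j]])

# polarization[p1, p2] with index 0 = V and 1 = H for photons 1 and 2.
BELL_POLARIZATIONS = {
    "Psi-": s * np.array([[0, 1], [-1, 0]]),
    "Psi+": s * np.array([[0, 1], [1, 0]]),
    "Phi-": s * np.array([[1, 0], [0, -1]]),
    "Phi+": s * np.array([[1, 0], [0, 1]]),
}


def output_state(polarization):
    """Symmetrized two-photon state, indexed as [pol 1, mode 1, pol 2, mode 2]."""
    mode1 = U[:, 0]  # photon 1 enters in mode a
    mode2 = U[:, 1]  # photon 2 enters in mode b
    psi12 = np.einsum("pq,m,n->pmqn", polarization, mode1, mode2)
    psi21 = psi12.transpose(2, 3, 0, 1)  # interchange the labels 1 and 2
    return (psi12 + psi21) / np.sqrt(2)


def coincidence_probability(polarization):
    """Probability that the photons are detected in different output modes."""
    out = output_state(polarization)
    different = (out[:, 0, :, 1], out[:, 1, :, 0])  # (c, d) and (d, c)
    return float(sum(np.sum(np.abs(block) ** 2) for block in different))


for name, pol in BELL_POLARIZATIONS.items():
    print(name, coincidence_probability(pol))
```

Only the antisymmetric polarization state $|\Psi^-\rangle_{12}$ should give detections in different output modes; for the other three Bell states both photons emerge on the same side.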


Exercise 7.16 Photon detectors are placed on each side of a beam splitter 
in paths corresponding to spatial states $|c\rangle$ and $|d\rangle$. What can you conclude 
when two photons are detected in coincidence (a) in the same detector, and 
(b) in different detectors? 


We can now give an overview of how the beam splitter indicated in Figure 7.11 
allowed Zeilinger and co-workers to achieve teleportation in 1997. To teleport the 
state of photon 1, Alice directs photons 1 and 2 to the input modes of a 50:50 
beam splitter and places photon detectors in the output ports of the beam splitter. 
Whenever Alice simultaneously registers a photon arrival in both detectors, the 
state vector in Equation 7.40 must have collapsed onto the fourth term, and so the 
state vector in Equation 7.32 has collapsed onto the first term, involving the Bell 
state $|\Psi^-\rangle_{12}$. She knows with certainty that photon 3 is in precisely the same state 
in which photon 1 was initially prepared (up to a constant factor of −1), although 
she knows nothing about this state. All that remains is for Alice to communicate 
to Bob (via a classical communication channel) that the photon state has been 
teleported to his photon. It is important to realize that information has not 
propagated instantaneously between Alice and Bob, as the classical signal 
cannot travel faster than the speed of light. The teleportation of the 
polarization state of photons was achieved in 1997 by Anton Zeilinger and 
co-workers using exactly the scheme outlined here. 

(Margin note: Special relativity tells us that no information can travel faster than light; it always turns out, as here, that the quantum non-local effects do not transgress this rule.) 


7.3.6 Unrestricted quantum teleportation 


We have explained the restricted Bell state measurement that enabled Zeilinger 
and his colleagues to achieve teleportation with a 25% strike rate. The problem 
with that arrangement is that whereas one particular Bell state results in photons 
appearing simultaneously on each side of the beam splitter, the other three Bell 
states all correspond to the same thing: two photons appearing simultaneously on 
one side or the other side. Such an apparatus cannot discriminate between 
the three Bell states that are not $|\Psi^-\rangle_{12}$. It turns out that complete Bell state 
measurements are possible, but with significantly more elaborate apparatus that is 
well beyond the scope of this chapter to describe. More recent experiments have 
achieved measurements that distinguish all the Bell states, thus enabling less 
restricted teleportation. We certainly expect interesting developments over the life 
of this course. 


Exercise 7.17 Make an outline of points that would be incorporated in an essay 
on ‘Quantum teleportation’. 


7.4 Quantum computing 


Recent theoretical advances concerning quantum entanglement, as well as 
technical advances such as new sources of entangled photons, have stimulated an 
enormous worldwide enthusiasm for research in quantum information. This 
burgeoning field includes quantum cryptography and teleportation, and also 
quantum computing. Indeed, it is quantum computing that has generated the 
largest volume of research activity. There is not enough space in this chapter to 
give a technical introduction to this field, partly because we would need to 
establish the technical jargon (registers, gates, etc.) of computing itself. However, 
the subject is too important to ignore completely. 


It is hard today to imagine a world without the internet, mobile phones, bank cash 
machines, CDs and DVDs, and desktop computers. Until now, all of these 
technologies have relied on the idea of storing, processing and transmitting 
data as strings of 1s and 0s (classical bits), and information technology has 
stored classical bits by controlling the macroscopic properties of materials. In a 
computer chip, the bit values of 1 and 0 have been associated with electrical 
switches, involving transistors, being on or off, but they may be represented in 
other ways. For example, the two data states of a bit can be represented by two 
different orientations of a magnetic domain on a computer disk. All such methods 
of representing data behave according to classical physics, involving the concerted 
behaviour of a great many of the electrons and atoms that comprise the device. As 
a result, a classical bit is always in one state or the other; each bit can be either 1 
or 0, but not both. 

Figure 7.14 Growth of the number of transistors for Intel computer processors 
(dots) and Moore’s law predictions for the number of transistors doubling every 
18 months, and every 24 months. 


In 1965, Gordon Moore, a co-founder of Intel, predicted that the number of 
transistors on a silicon computer chip would double roughly every 18 months. As 
Figure 7.14 shows, this prediction has held roughly correct to the present day; as a 
result, computational speed has increased by a factor of 1000 every 15 years. The 
increase in transistor density on computer chips entails a decrease in transistor 
size. By 2006, the features etched into a silicon wafer to form transistors had 
shrunk to a size of 65 nm, and by 2020 will probably be below 10 nm. At this 
point, the size of the transistors will approach atomic dimensions and the number 
of electrons per circuit element will be just one or two. The operation of the logic 
elements on a chip will then no longer be determined by classical physics, 
as quantum effects will dominate. This presents a limitation to conventional 
information technology but also brings new possibilities. 
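
As a rough check of the arithmetic behind the factor of 1000, and assuming (as the text implies) that computational speed grows in proportion to the number of transistors, a doubling every 18 months gives ten doublings in 15 years:

```latex
\[
2^{\,15\ \text{yr}/1.5\ \text{yr}} = 2^{10} = 1024 \approx 10^{3},
\qquad \text{whereas} \qquad
2^{\,15\ \text{yr}/2\ \text{yr}} = 2^{7.5} \approx 181 .
\]
```

The second figure is the corresponding growth for the 24-month doubling period that is also plotted in Figure 7.14.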


The central idea behind ‘quantum information’ is that bits are encoded in quantum 
states of individual particles rather than in macroscopic properties. For example, a 
horizontally-polarized photon could represent a 0, and a vertically-polarized 
photon could represent a 1. As we saw earlier, however, the state of a photon 
can be an arbitrary normalized linear superposition of states corresponding to 
vertical polarization and horizontal polarization, and this means that quantum bits 
(or qubits) are much richer in information than classical bits. Moreover, reading 
the value of a classical bit does not affect the value of the bit, whereas measuring 
the value of a qubit will cause the state of the qubit irreversibly to collapse 
onto either a 1 or a 0 state, with a probability depending upon the particular 
measurement as well as on the state of the qubit before measurement. 


A quantum computer does not, of course, store information in just one qubit; there 
are many qubits in an entangled state. It turns out that $n$ qubits represent a linear 
superposition of $2^n$ quantum states, a vast number when $n$ gets into the hundreds. 
Because of the quantum correlation between the entangled qubits, an operation on 
one of them would simultaneously affect the state of all of them. This opens the 
way to massively parallel processing that may allow quantum computers to 
succeed in tackling particular classes of problem that are effectively beyond the 
power of classical computers to solve in a reasonable time. 
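
To get a feel for how quickly $2^n$ grows, the short sketch below counts the complex amplitudes in a general $n$-qubit state and estimates the memory a classical computer would need to store them all. The figure of 16 bytes per amplitude (one double-precision complex number) and the function names are assumptions made for illustration.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: one double-precision complex number


def state_vector_size(n_qubits):
    """Number of complex amplitudes in a general state of n qubits."""
    return 2 ** n_qubits


def classical_storage_bytes(n_qubits):
    """Memory needed to hold every amplitude on a classical computer."""
    return state_vector_size(n_qubits) * BYTES_PER_AMPLITUDE


for n in (1, 2, 10, 50, 300):
    print(n, state_vector_size(n), classical_storage_bytes(n))
```

Each additional qubit doubles the number of amplitudes, so direct classical storage of the state vector quickly becomes impractical.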


As an example, we mention an algorithm devised by Shor that is specifically 
adapted to quantum computers. This algorithm is for finding the prime factors of 
very large numbers, something effectively beyond the capability of classical 
computers. To date, a quantum computer has actually shown only that the prime 
factors of 15 are 3 and 5, but the principle has been proven. One of the main 
difficulties in constructing a more practical quantum computer is the fact that the 
entangled linear superpositions of quantum states needed by the computer are 
generally very sensitive to external disturbances. Learning how to cope with such 
difficulties is at the forefront of current research. 

Section 7.1 The polarization of photons is a major vehicle for the transport of 
quantum information. Polarizing beam analyzers can be used to measure the 
linear polarization of a photon relative to a given polarizer axis, with the possible 
outcomes being vertical polarization (value +1) and horizontal polarization 
(value −1). The corresponding quantum-mechanical operator $P(\theta)$ has 
eigenvectors $|V_\theta\rangle$ and $|H_\theta\rangle$, with eigenvalues +1 and −1. These eigenvectors 
form a basis for any state of linear polarization of a photon. 


Section 7.2 Quantum key distribution, QKD, enables the secure transmission of 
information. The essential idea is that quantum measurements, unlike classical 
measurements, inevitably disturb the measured system. Quantum protocols for 
quantum cryptography have been devised to exploit this fact. Any eavesdropper 
would inevitably collapse the state of quantum particles bearing the transmitted 
information. The BB84 protocol does not involve entanglement, but another 
protocol, based on a proposal due to Artur Ekert, uses entanglement in an 
essential way. Both methods have been shown to work and are being developed 
for commercial applications. 


(Note: This would be a suitable time to view the filmed sequence ‘Quantum information’ on the DVD. Two leading researchers in the field of quantum information discuss quantum cryptography, quantum teleportation and quantum computing, and demonstrate some laboratory equipment involved in experiments in these areas.) 

Section 7.3 Quantum teleportation is the exact transfer of the unknown quantum 
state of a particle (usually a photon) to a distant particle. At first sight, three 
features of quantum mechanics appear to make this difficult. Firstly, it is 
impossible to measure an arbitrary state of a system exactly, without disturbing 
it. Secondly, the no-cloning theorem tells us that the linearity of operators in 
quantum mechanics forbids the cloning of states. Thirdly, quantum bits (or qubits) 
contain an infinite amount of information. Nevertheless, another feature of 
quantum mechanics, entanglement, can be exploited to make teleportation 
work. 


One scheme for teleporting the state of a single photon (labelled 1) involves 
creating a separate entangled pair of photons (labelled 2 and 3) and using a beam 
splitter to combine photons 1 and 2. By making a Bell state measurement, the 
distant photon 3 can be collapsed onto a state that is closely related to the original 
state of photon 1. Classical communication can inform the recipient of photon 3 
what needs to be done to it to make its state the same as the initial state of 
photon 1. The first successful teleportation experiment succeeded in teleporting 
the state of 25% of the photons. This limitation was due to the fact that only one 
of the four ‘Bell state measurements’ is straightforward.


Section 7.4 A quantum computer stores information in entangled linear 
superpositions of quantum states. Although sensitive to external disturbances, 
quantum computers may allow massively parallel processing, allowing special 
classes of problem (including the factoring of large numbers) to be solved in cases 
that are beyond the powers of ordinary computers. 


Achievements from Chapter 7 


After studying this chapter you should be able to: 


7.1 Explain the meanings of the newly defined (emboldened) terms and 
symbols, and use them appropriately. 


7.2 Calculate the probability of the outcomes of measurement of photon 
polarization, given a photon polarization state vector. 


7.3 Explain the BB84 protocol for establishing secure cryptographic keys, and 
answer questions concerning its implementation. 


7.4 Give an account of the Ekert protocol whereby entangled photons can be
employed for establishing secure cryptographic keys. 


7.5 Explain why it is not possible to clone the state of a quantum-mechanical 
system. 


7.6 Explain the general nature of quantum teleportation, including a description 
of what is and what is not transported, and a statement of its significance, 
calling upon the no-cloning theorem, and the information contained in a 
qubit. 


7.7 Outline (with a suitable diagram) the steps involved in teleporting the state 
of a photon. 


7.8 Explain the role of entanglement and Bell state measurements in quantum 
teleportation, and explain the limitations of the first successful teleportation 
experiments in terms of Bell state measurements. 


7.9 Give a very brief overview of the prospects for quantum computing. 


Chapter 8 Mathematical toolkit 


Introduction 


This Mathematical toolkit provides additional support for some mathematical 
topics that you will meet elsewhere in this book. It deals with the concepts of 
vectors and matrices, and shows you how to carry out some basic manipulations 
that involve them. 


8.1 Vectors in ordinary space 


You will have met vectors in the context of physical quantities such as force, 
acceleration and velocity. Vectors like this, in ordinary three-dimensional space, 
may be referred to as ordinary vectors. This is to distinguish them from the more 
general abstract vectors we need to discuss later on. Here, we briefly review the 
properties of ordinary vectors. 


First, we need to define a scalar. A scalar quantity is fully described by a single 
number, together with an appropriate unit of measurement. For example, mass, 
charge and temperature are scalar quantities. Some scalars, such as mass, turn out 
to be non-negative but others, such as charge, can be positive, zero or negative. 
The magnitude of a scalar quantity is the size of the quantity ignoring any 
possible negative sign. If x is a scalar, we denote its magnitude by |x| so, with
x = −5 m, we have |x| = 5 m. Magnitudes can never be negative.


8.1.1 Geometric interpretation of vectors 


An (ordinary) vector is a quantity that is characterized by both a magnitude and a
direction in (ordinary) space. For example, the velocity of a particle is a vector
because it has a magnitude (the particle’s speed) and a direction (the particle’s
direction of motion). In print, vectors are usually denoted by bold type, e.g. $\mathbf{v}$.
In handwritten work, they are generally denoted by underlining with straight or
curly lines (e.g. $\underline{a}$ or $\underset{\sim}{a}$). The magnitude of a vector $\mathbf{a}$ can be written as $|\mathbf{a}|$.
More commonly, though, it is written simply as $a$, where the absence of bold print
(or underlining) serves to show that $a$ is not a vector.


Multiplying a vector by a scalar


We often need to multiply a vector by a scalar. Figure 8.1 shows
how this is interpreted. Given any vector $\mathbf{a}$ and any scalar λ, the
product $\lambda\mathbf{a}$ is a new vector with magnitude $|\lambda|a$, pointing either
parallel or antiparallel to $\mathbf{a}$. If λ is positive, $\lambda\mathbf{a}$ points in the same
direction as $\mathbf{a}$; if λ is negative, $\lambda\mathbf{a}$ points in the opposite direction.


Figure 8.1 Multiplying a vector by a scalar.


The vector $(1/a)\,\mathbf{a}$ is a vector of unit magnitude pointing in the same direction as
$\mathbf{a}$. Such a vector is called a unit vector. Any unit vector is dimensionless and has
magnitude 1, not 1 metre, 1 newton, or 1 anything else.


Figure 8.2 The triangle rule of vector addition.


Figure 8.3 (a) A Cartesian coordinate system with three mutually perpendicular
axes and three basis vectors $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$; (b) in the given coordinate
system, a vector $\mathbf{a}$ has components $a_x$, $a_y$ and $a_z$.


Adding two vectors


We can also add vectors together to produce a new vector. The geometric rule for 
adding two vectors is shown in Figure 8.2. Arrows representing the vectors are 
drawn with the head of the first arrow, a, coincident with the tail of the second 
arrow, b. The arrow joining the tail of a to the head of b then represents the 
vector sum a + b. This is called the triangle rule of vector addition. Any 
number of vectors can be added together by repeated applications of this rule. In 
geometric terms, it is clear that 


$$|\mathbf{a} + \mathbf{b}| \le |\mathbf{a}| + |\mathbf{b}|,$$


and this is called the triangle inequality. 


Exercise 8.1 The positions of points 1 and 2 are represented by the position
vectors $\mathbf{r}_1$ and $\mathbf{r}_2$. What interpretation can be given to (a) $\mathbf{r}_2 - \mathbf{r}_1$; (b) $|\mathbf{r}_2 - \mathbf{r}_1|$
and (c) $(\mathbf{r}_2 + \mathbf{r}_1)/2$?


8.1.2 Components of vectors 


Strictly speaking, vectors are independent of any coordinate system, but in 
practice it is difficult to describe or manipulate them without using a fixed 
coordinate system. We generally use a Cartesian coordinate system — a set of 
three mutually perpendicular axes meeting at an origin. Three vectors of unit 
length (called basis vectors) point along the directions of the three axes. The 
usual way of labelling a Cartesian coordinate system is shown in Figure 8.3a: the 
three axes are called the x-axis, y-axis and the z-axis, and the corresponding basis 
vectors are labelled $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$.


The set of three basis vectors is said to provide a basis for ordinary 
three-dimensional space. This means that any vector a can be expressed as a 
linear combination of the three basis vectors: 

$$\mathbf{a} = a_x\mathbf{e}_x + a_y\mathbf{e}_y + a_z\mathbf{e}_z, \tag{8.1}$$


where the coefficients in the sum are the (scalar) components of the vector. If all 
three components are equal to zero, the vector is called the zero vector, 0. 




The components are defined by dropping perpendiculars onto the axes, as in 
Figure 8.3b. If the vector a has magnitude a and points in a direction that makes 
an angle $\theta_x$ with the positive x-direction, the x-component of a is given by

$$a_x = a\cos\theta_x, \quad\text{where } 0 \le \theta_x \le \pi, \tag{8.2}$$

with similar definitions for $a_y$ and $a_z$.


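Vector operations are simply expressed in terms of components. To multiply a
vector by a scalar, λ, we multiply each of its components by λ:

$$\lambda\mathbf{a} = (\lambda a_x)\,\mathbf{e}_x + (\lambda a_y)\,\mathbf{e}_y + (\lambda a_z)\,\mathbf{e}_z.$$

To add or subtract two vectors, we add or subtract their components:

$$\mathbf{a} + \mathbf{b} = (a_x + b_x)\,\mathbf{e}_x + (a_y + b_y)\,\mathbf{e}_y + (a_z + b_z)\,\mathbf{e}_z,$$

$$\mathbf{a} - \mathbf{b} = (a_x - b_x)\,\mathbf{e}_x + (a_y - b_y)\,\mathbf{e}_y + (a_z - b_z)\,\mathbf{e}_z,$$

and, more generally, any linear combination of vectors involves a similar linear
combination of components:

$$\lambda\mathbf{a} + \mu\mathbf{b} = (\lambda a_x + \mu b_x)\,\mathbf{e}_x + (\lambda a_y + \mu b_y)\,\mathbf{e}_y + (\lambda a_z + \mu b_z)\,\mathbf{e}_z.$$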
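
The component rules above translate directly into code. The following is a minimal sketch rather than part of the original text: it assumes a vector is stored as a tuple of its three components, and the function names scale, add and linear_combination, as well as the sample values, are chosen here purely for illustration.

```python
def scale(lam, a):
    """Multiply the vector a (a tuple of components) by the scalar lam."""
    return tuple(lam * component for component in a)


def add(a, b):
    """Add two vectors component by component."""
    return tuple(x + y for x, y in zip(a, b))


def linear_combination(lam, a, mu, b):
    """Form lam*a + mu*b from the components of a and b."""
    return add(scale(lam, a), scale(mu, b))


# Illustrative components (assumed values, not taken from the text).
a = (1, 2, 3)
b = (0, 1, -1)

print(linear_combination(2, a, -1, b))  # (2, 3, 7)
```

Subtraction needs no separate function in this sketch, because a − b is the linear combination with λ = 1 and μ = −1.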


Exercise 8.2 Given that $\mathbf{a} = \mathbf{e}_x + 3\mathbf{e}_y$ and $\mathbf{b} = 5\mathbf{e}_x - 7\mathbf{e}_y$, find $3\mathbf{a} + 2\mathbf{b}$.


8.1.3 Scalar products of vectors 


The scalar product (or dot product) of two vectors a and b is defined by

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z, \tag{8.3}$$

where $a_x$, $a_y$ and $a_z$, and $b_x$, $b_y$ and $b_z$, are the components of the vectors in a
given Cartesian coordinate system. It can be shown that the right-hand side of
Equation 8.3 is independent of the orientation of the coordinate system. Let us
temporarily choose a coordinate system whose x-axis is aligned with the vector a
so that $a_y = a_z = 0$. In this special coordinate system, Equations 8.2 and 8.3 give

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x = (a\cos 0)\times(b\cos\theta) = ab\cos\theta, \tag{8.4}$$


where a and b are the magnitudes of a and b and θ is the angle between their
directions, which is taken to lie in the range 0 ≤ θ ≤ π (Figure 8.4). Now, the
extreme right-hand side of Equation 8.4 involves only quantities a, b and θ that do
not depend on the choice of coordinate system, so the formula

$$\mathbf{a}\cdot\mathbf{b} = ab\cos\theta \tag{8.5}$$

provides an alternative definition of the scalar product, valid in any coordinate
system.


Figure 8.4 The angle θ between the directions of a and b is taken to be in the
range 0 ≤ θ ≤ π.


In the special case where b = a, we have θ = 0, so $\mathbf{a}\cdot\mathbf{a} = a^2$. It follows that the
magnitude of any vector a is given by

$$a = \sqrt{\mathbf{a}\cdot\mathbf{a}} = \sqrt{a_x^2 + a_y^2 + a_z^2}. \tag{8.6}$$

This formula can be thought of as a three-dimensional version of Pythagoras’
theorem.

In particular, we can say that $\mathbf{n} = n_x\mathbf{e}_x + n_y\mathbf{e}_y + n_z\mathbf{e}_z$ is a unit vector if

$$\mathbf{n}\cdot\mathbf{n} = n_x^2 + n_y^2 + n_z^2 = 1. \tag{8.7}$$

Vectors satisfying this condition are said to be normalized.


Figure 8.5 The x-component of a vector a is given by
$a_x = a\cos\theta_x = \mathbf{e}_x\cdot\mathbf{a}$. In geometric terms, this is found by
projecting a onto the x-axis.


If two vectors are perpendicular, the angle θ between their directions is π/2
radians, and $\mathbf{a}\cdot\mathbf{b} = ab\cos(\pi/2) = 0$. So two vectors pointing in perpendicular
directions obey

$$\mathbf{a}\cdot\mathbf{b} = a_x b_x + a_y b_y + a_z b_z = 0. \tag{8.8}$$


Vectors satisfying this condition are said to be orthogonal. By definition, the zero
vector is orthogonal to any other vector. The three basis vectors $\mathbf{e}_x$, $\mathbf{e}_y$ and $\mathbf{e}_z$ are
said to be orthonormal because each is normalized and each pair of these vectors
is orthogonal.


Because the basis vectors are orthonormal, taking the scalar product of a basis
vector $\mathbf{e}_x$ with any vector a gives

$$\mathbf{e}_x\cdot\mathbf{a} = \mathbf{e}_x\cdot(a_x\mathbf{e}_x + a_y\mathbf{e}_y + a_z\mathbf{e}_z) = a_x\,\mathbf{e}_x\cdot\mathbf{e}_x + a_y\,\mathbf{e}_x\cdot\mathbf{e}_y + a_z\,\mathbf{e}_x\cdot\mathbf{e}_z = a_x. \tag{8.9}$$

More generally, any component of the vector can be found by taking its scalar
product with a basis vector. In geometric terms, this is interpreted as a projection
onto the corresponding coordinate axis (Figure 8.5).


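The scalar product has all the properties you would expect of a product. For
example, if a = b, and c is any vector, you can take the scalar product on both
sides to form a valid scalar equation $\mathbf{c}\cdot\mathbf{a} = \mathbf{c}\cdot\mathbf{b}$. Moreover,

$$\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a}, \tag{8.10}$$

$$\mathbf{a}\cdot(\mathbf{b} + \mathbf{c}) = \mathbf{a}\cdot\mathbf{b} + \mathbf{a}\cdot\mathbf{c}, \tag{8.11}$$

$$\mathbf{a}\cdot(\lambda\mathbf{b}) = \lambda(\mathbf{a}\cdot\mathbf{b}), \tag{8.12}$$

and

$$\mathbf{a}\cdot\mathbf{a} \ge 0. \tag{8.13}$$

Because $\cos^2\theta \le 1$, Equations 8.5 and 8.6 also lead to the inequality

$$|\mathbf{a}\cdot\mathbf{b}|^2 \le (\mathbf{a}\cdot\mathbf{a})\,(\mathbf{b}\cdot\mathbf{b}),$$

a result known as the Cauchy–Schwarz inequality.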
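
To see Equations 8.3, 8.5 and 8.6 working together, the short Python sketch below computes a scalar product, a magnitude and the angle between two vectors, and then checks the Cauchy–Schwarz inequality numerically. It is an illustration under stated assumptions, not part of the original text: vectors are stored as tuples of components, the function names are hypothetical, and the sample vectors are arbitrary.

```python
import math


def dot(a, b):
    """Scalar product in components (Equation 8.3)."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(a):
    """Magnitude of a vector (Equation 8.6)."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle in radians between two non-zero vectors (from Equation 8.5)."""
    return math.acos(dot(a, b) / (magnitude(a) * magnitude(b)))


# Arbitrary sample vectors (assumed values).
a = (1.0, 2.0, 2.0)
b = (2.0, 0.0, 0.0)

print(dot(a, b))            # 2.0
print(magnitude(a))         # 3.0
print(angle_between(a, b))  # the angle whose cosine is 1/3

# Cauchy-Schwarz: |a.b|^2 <= (a.a)(b.b)
print(dot(a, b) ** 2 <= dot(a, a) * dot(b, b))  # True
```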

Exercise 8.3 Show that the vectors $\mathbf{a} = 0.6\,\mathbf{e}_x + 0.8\,\mathbf{e}_y$ and
$\mathbf{b} = -0.8\,\mathbf{e}_x + 0.6\,\mathbf{e}_y$ are normalized and orthogonal to one another. Is the
Cauchy–Schwarz inequality satisfied in this case?


Exercise 8.4 Two vectors satisfy $\mathbf{a}\cdot\mathbf{b} = -ab$. What are their relative
directions?


8.1.4 Vector products 


The scalar product takes two vectors, a and b, and produces a scalar, a · b.
However, we can also multiply two vectors to produce another vector, written as
a × b and called the vector product (or cross product) of a and b. Vector
products are not as important in this course as scalar products, but they have an
important role in Chapter 2 in the definition of angular momentum.


The vector product of a and b is a vector quantity defined by

$$\mathbf{a}\times\mathbf{b} = (a_y b_z - a_z b_y)\,\mathbf{e}_x + (a_z b_x - a_x b_z)\,\mathbf{e}_y + (a_x b_y - a_y b_x)\,\mathbf{e}_z, \tag{8.14}$$


where the components of the vectors are taken in a (right-handed) Cartesian 
coordinate system. 


There is a strong pattern in Equation 8.14. The x-component of the vector
product is the difference of two terms. The first term is the y-component of
the first vector times the z-component of the second vector; the second term
takes these components in the opposite order. The y- and z-components of
the vector product follow a similar pattern, based on the cyclic permutation
x → y → z → x. This pattern can also be represented by a determinant:

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix}, \tag{8.15}$$

Determinants are discussed further in Section 8.3.5.

which can be expanded to give

$$\mathbf{a}\times\mathbf{b} = \begin{vmatrix} a_y & a_z \\ b_y & b_z \end{vmatrix}\mathbf{e}_x - \begin{vmatrix} a_x & a_z \\ b_x & b_z \end{vmatrix}\mathbf{e}_y + \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix}\mathbf{e}_z,$$

and Equation 8.14 is recovered when we expand the 2 × 2 determinants.


Exercise 8.5 Find the vector product a × b, for $\mathbf{a} = 3\mathbf{e}_x + 4\mathbf{e}_y$ and
$\mathbf{b} = -4\mathbf{e}_x + 3\mathbf{e}_y$.
An equivalent definition of the vector product a × b is that it is a vector of
magnitude

$$|\mathbf{a}\times\mathbf{b}| = ab\sin\theta, \tag{8.16}$$

where a and b are the magnitudes of a and b and θ is the smaller
of the angles between their directions, which lies in the range
0 ≤ θ ≤ π. The direction of a × b is given by the right-hand
rule shown in Figure 8.6: point the fingers of your right hand
in the direction of the first vector in the product, a, and bend them
(rotating your wrist if necessary) in the direction of the second
vector, b. The vector product a × b is then perpendicular to both
a and b, in the sense indicated by your outstretched right thumb.


The vector product has many of the properties you would expect 
of a product. For example, 


$$\mathbf{a}\times(\mathbf{b} + \mathbf{c}) = \mathbf{a}\times\mathbf{b} + \mathbf{a}\times\mathbf{c}$$

and

$$\mathbf{a}\times(\lambda\mathbf{b}) = (\lambda\mathbf{a})\times\mathbf{b} = \lambda(\mathbf{a}\times\mathbf{b}).$$

However, it is important to note that the order of the vectors
in a vector product is significant. As Equation 8.14 shows,

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a}.$$


Figure 8.6 Using the right-hand rule to find the direction of a vector product.


The vector product of a vector with itself is equal to the zero vector:

$$\mathbf{a}\times\mathbf{a} = \mathbf{0},$$

and, more generally, the vector product a × b = 0 if a and b are collinear (i.e.
are parallel or antiparallel).
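
The component definition of the vector product (Equation 8.14) and the properties just listed can be checked numerically. The Python sketch below is illustrative only: it assumes right-handed Cartesian components stored as tuples, and the function names and sample vectors are not from the original text.

```python
def cross(a, b):
    """Vector product in right-handed Cartesian components (Equation 8.14)."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)


def dot(a, b):
    """Scalar product in components (Equation 8.3)."""
    return sum(x * y for x, y in zip(a, b))


# Arbitrary sample vectors (assumed values).
a = (1, 2, 3)
b = (4, 5, 6)
c = cross(a, b)

print(c)                     # (-3, 6, -3)
print(cross(b, a))           # (3, -6, 3): reversing the order flips the sign
print(cross(a, a))           # (0, 0, 0): a vector crossed with itself
print(dot(a, c), dot(b, c))  # 0 0: the product is perpendicular to both
```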


Exercise 8.6 A vector a is directed horizontally to the South and vector b is
directed horizontally to the West; what is the direction of a × b?


8.2 Abstract vector spaces 


In this book, our main interest in vectors lies in generalizations beyond ordinary 
three-dimensional space. In Chapter 1 you will see that wave functions can be 
represented as vectors with complex components in a space with infinitely-many 
dimensions. Some echoes of this have filtered down into the language of wave 
mechanics: for example, we talk of two functions as being normalized and 
orthogonal to one another. Moreover, in Chapter 3 you will see that the spin of an 
electron can be represented by a complex vector in a two-dimensional space. 


8.2.1 First steps towards generalization 


It is important to use appropriate notation. First, we shall abandon the labels x, y 
and z used to identify axes and components in ordinary space. In a space of many 
dimensions, it is much more sensible to use the labels 1, 2, 3, ..., which can be 
continued indefinitely. 


We shall also denote abstract vectors in a different way. This is to avoid confusion
with ordinary vectors, and to use a notation that will be helpful in quantum
mechanics. Instead of using bold print to indicate a vector, as in a, we shall place
a non-bold symbol in an angular bracket, as in |a⟩. Using this notation, the natural
extension of Equation 8.1 is

$$|a\rangle = a_1|e_1\rangle + a_2|e_2\rangle + \ldots = \sum_i a_i|e_i\rangle, \tag{8.17}$$

where |a⟩ is a vector with components $a_1, a_2, \ldots$, expressed as a linear
combination of orthonormal basis vectors $|e_1\rangle, |e_2\rangle, \ldots$. Equation 8.17 makes two
important generalizations:


1. There can be any number of orthonormal basis vectors $|e_i\rangle$, perhaps an
infinite number of them, in which case the right-hand side of Equation 8.17
is an infinite sum.


2. The components $a_i$ can be complex (rather than real) numbers.


Apart from these generalizations, the algebraic properties of vectors remain much
the same as before. For example, vectors can be added together or multiplied by
scalars, just as you would expect. We have

$$|a\rangle + |b\rangle = \sum_i (a_i + b_i)|e_i\rangle \tag{8.18}$$

and

$$\lambda|a\rangle = \sum_i (\lambda a_i)|e_i\rangle. \tag{8.19}$$
We shall also introduce a new notation for scalar products. Rather than writing
a · b, we shall write ⟨a|b⟩. We shall also make a slight generalization in the
definition of a scalar product. For ordinary vectors, we defined

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3. \tag{Eqn 8.3}$$

Now we are dealing with vectors that can have an arbitrary number of complex
components, and it is appropriate to define the scalar product of |a⟩ and |b⟩ as
follows:

$$\langle a|b\rangle = a_1^* b_1 + a_2^* b_2 + \ldots = \sum_i a_i^* b_i, \tag{8.20}$$

where the components of the first vector, |a⟩, are complex conjugated. This makes
no difference to vectors with real components, but it matters in general. It means
that the ordering of vectors in the scalar product is significant:

$$\langle b|a\rangle = \langle a|b\rangle^*. \tag{8.21}$$

The reason for using the definition given in Equation 8.20 is that it ensures that
the scalar product of |a⟩ with itself is

$$\langle a|a\rangle = \sum_i a_i^* a_i = \sum_i |a_i|^2, \tag{8.22}$$

which is the sum of real non-negative terms. It therefore follows that

$$\langle a|a\rangle \ge 0, \tag{8.23}$$

which is the analogue of Equation 8.13. This allows us to interpret $\sqrt{\langle a|a\rangle}$ as the
real, non-negative magnitude of the vector |a⟩.


A vector |a⟩ with ⟨a|a⟩ = 1 is said to be normalized, and two vectors |a⟩ and
|b⟩ with ⟨a|b⟩ = 0 are said to be orthogonal. The basis vectors $|e_i\rangle$ are both
normalized and orthogonal (we say that they are orthonormal), so

$$\langle e_i|e_j\rangle = \delta_{ij}, \tag{8.24}$$

where $\delta_{ij}$ is the usual Kronecker delta symbol.

The advantage of expressing a given vector in terms of a linear sum of
orthonormal basis vectors is that the coefficients in the sum (the components of
the vector) are easily found by taking scalar products. For example, taking the
scalar product of both sides of Equation 8.17 with $|e_j\rangle$ gives

$$\langle e_j|a\rangle = \sum_i a_i\langle e_j|e_i\rangle = \sum_i a_i\delta_{ji} = a_j,$$

so the component $a_j$ is given by the scalar product $\langle e_j|a\rangle$. This generalizes
Equation 8.9 to any vector space.
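
The role of the complex conjugate in Equation 8.20 is easy to explore numerically. The sketch below is an illustration only, assuming a two-dimensional space with vectors stored as Python lists of complex components in an orthonormal basis; the function name inner and the sample components are hypothetical choices, not taken from the text.

```python
def inner(a, b):
    """Inner product <a|b> (Equation 8.20): conjugate the components of the first vector."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Sample vectors with complex components (assumed values).
a = [1 + 2j, 3j]
b = [2 - 1j, 1 + 1j]

# Orthonormal basis vectors in the same representation.
e1 = [1, 0]
e2 = [0, 1]

print(inner(a, b))       # (3-8j)
print(inner(b, a))       # (3+8j): the complex conjugate of <a|b>, as in Equation 8.21
print(inner(a, a).real)  # 14.0: |1+2j|^2 + |3j|^2, real and non-negative
print(inner(e1, a))      # (1+2j): the component a_1 recovered as <e_1|a>
print(inner(e2, a))      # 3j: the component a_2 recovered as <e_2|a>
```

Conjugating the second vector instead of the first would give a different but equally consistent convention; the choice made in this sketch follows Equation 8.20.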


You might wonder what all this has to do with ‘the real world’. No-one has ever 
seen 100 vectors pointing in 100 orthogonal directions, and we have not begun to 
explain what this might mean. Fortunately, there is a more rigorous way of 
proceeding; this will be sketched in the next two subsections. 
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8.2.2 Vector spaces 


Doctor Johnson, in the preface to his Dictionary, pointed out that: ‘to explain, 
requires the use of terms less abstruse than that which is to be explained, and such 
terms cannot always be found ... for the easiest word, whatever it may be, cannot 
be translated into one easier’. 


This creates a difficulty for mathematics, which needs precise definitions. To get 
around this difficulty, mathematicians focus on the rules governing ways objects 
combine with one another. Using the properties of ordinary vectors as a guide, 
mathematicians have drawn up a list of rules that distil the essence of ‘behaving 
like a vector’. Then, anything that obeys these rules is taken — by definition — to 
be a vector. This way of defining a vector is unambiguous and leads to fruitful 
generalizations. 


Here is the definitive list of vector properties, using the |a⟩ notation introduced
above; α and β are arbitrary scalars.


Box 1 Defining properties of vectors


1. Any two vectors can be added, and the result is a vector; this operation
obeys the rules:

$$|a\rangle + |b\rangle = |b\rangle + |a\rangle$$

$$(|a\rangle + |b\rangle) + |c\rangle = |a\rangle + (|b\rangle + |c\rangle)$$


2. There is a zero vector, denoted by |0⟩, such that

$$|a\rangle + |0\rangle = |a\rangle \quad\text{for any vector } |a\rangle$$


3. Any vector can be multiplied by a scalar, and the result is a vector; this
operation obeys the rules:

$$\alpha(\beta|a\rangle) = (\alpha\beta)|a\rangle$$

$$(\alpha + \beta)|a\rangle = \alpha|a\rangle + \beta|a\rangle$$

$$\alpha(|a\rangle + |b\rangle) = \alpha|a\rangle + \alpha|b\rangle$$


4. Multiplication of any vector |a⟩ by 0 and by 1 gives

$$0\,|a\rangle = |0\rangle$$

$$1\,|a\rangle = |a\rangle$$


You need not commit the above rules to memory, but you should be aware that a 
clear-cut set of rules exists. The main point is that any objects obeying these rules 
are classified as being (abstract) vectors, and any set of vectors generated by 
applying these rules exhaustively is called a vector space. So, a position vector is 
a vector, and the set of all the vectors that describe positions in three-dimensional 
space is a vector space. 


However, there are many different vector spaces. One example is provided by the set
of all cubic polynomials in x, i.e. expressions of the form

$$a(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3, \tag{8.25}$$

where $a_0, \ldots, a_3$ are complex constants (possibly zero) and x is a real variable. If
we multiply a(x) by a scalar λ, we obtain another cubic polynomial:

$$\lambda a(x) = (\lambda a_0) + (\lambda a_1)\,x + (\lambda a_2)\,x^2 + (\lambda a_3)\,x^3.$$

Moreover, if $b(x) = b_0 + b_1 x + b_2 x^2 + b_3 x^3$, then

$$a(x) + b(x) = (a_0 + b_0) + (a_1 + b_1)\,x + (a_2 + b_2)\,x^2 + (a_3 + b_3)\,x^3,$$

so adding any two cubic polynomials produces another cubic polynomial.


In fact, if you go through all the properties in Box 1, you will find that cubic 
polynomials possess all the defining properties of vectors. Consequently, although 
it may come as a bit of a surprise, cubic polynomials can be regarded as vectors in 
an abstract vector space. Because four scalars ($a_0$, $a_1$, $a_2$ and $a_3$) are needed
to specify an arbitrary cubic polynomial, the corresponding vector space is
four-dimensional. This result can be readily generalized; polynomials of order 99 
belong to a vector space of dimension 100. 


8.2.3 Inner product spaces 


Our definition of a vector, and of a vector space, did not include any notion of 
multiplying vectors together. Nevertheless, when we use vector spaces in physical 
applications, including quantum mechanics, we usually need to combine vectors 
to form scalars, so we need to define a scalar product. (The vector product of two 
vectors is much less useful and we shall not need to consider it further.) 


In an abstract vector space, the term scalar product is usually replaced by the term 
inner product. Equation 8.20 gives a rule for calculating the inner product of two 
vectors in any vector space, but stating the rule in this way is rather circular 
because the components are assumed to be defined in an orthonormal basis, and 
the concept of orthonormality itself relies on inner products (see Equation 8.24). 


We therefore adopt the same mathematical tactic as before, collecting a list of 
required properties. In the following list, |a⟩, |b⟩ and |c⟩ are vectors and λ is a
scalar. 


Box 2 Defining properties of an inner product


An inner product of two vectors is a scalar quantity that satisfies the
following properties:


1. $\langle b|a\rangle = \langle a|b\rangle^*$,

2. $\langle a|\big(|b\rangle + |c\rangle\big) = \langle a|b\rangle + \langle a|c\rangle$,

3. $\langle a|\big(\lambda|b\rangle\big) = \lambda\langle a|b\rangle$, where λ is a scalar,

4. $\langle a|a\rangle \ge 0$, with the equals sign applying only if |a⟩ = |0⟩.


These are just the properties listed in Equations 8.10 to 8.13, but written in our
new notation, and with Property 1 now generalized to handle vectors with
complex components. Any way of combining vectors that satisfies these
properties is an inner product and the corresponding vector space is then called
an inner product space.


In the case of the vector space of cubic polynomials considered previously, one
way of defining an inner product is as follows:

$$\langle a|b\rangle = \int_{-1}^{1} a^*(x)\,b(x)\,dx. \tag{8.26}$$

With this choice, the following functions turn out to be orthonormal:

$$\begin{aligned}
e_0(x) &= \frac{1}{\sqrt{2}}, \\
e_1(x) &= \sqrt{\tfrac{3}{2}}\,x, \\
e_2(x) &= \sqrt{\tfrac{5}{2}} \times \tfrac{1}{2}\,(3x^2 - 1), \\
e_3(x) &= \sqrt{\tfrac{7}{2}} \times \tfrac{1}{2}\,(5x^3 - 3x),
\end{aligned} \tag{8.27}$$

where x is real. That is,

$$\langle e_i|e_j\rangle = \int_{-1}^{1} e_i^*(x)\,e_j(x)\,dx = \delta_{ij}. \tag{8.28}$$

Apart from its extra factor $\sqrt{n + \tfrac{1}{2}}$, the function $e_n(x)$ is known to
mathematicians as a Legendre polynomial of order n.

Since any cubic polynomial can be written as a linear combination of $e_0(x)$,
$e_1(x)$, $e_2(x)$ and $e_3(x)$, these four polynomials provide an orthonormal basis for
the vector space of cubic polynomials.
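
The orthonormality claimed in Equation 8.28 can be checked numerically. The Python sketch below is a minimal illustration under stated assumptions: each polynomial is stored as a list of real coefficients (lowest power first), the basis functions are taken to be the Legendre polynomials of order 0 to 3 multiplied by $\sqrt{n + 1/2}$, and the integral from −1 to 1 is done exactly term by term. The function names are hypothetical.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [c0, c1, c2, ...]."""
    product = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            product[i + j] += p_i * q_j
    return product


def integrate(p):
    """Integrate a polynomial from -1 to 1; odd powers of x contribute zero."""
    return sum(2.0 * c / (k + 1) for k, c in enumerate(p) if k % 2 == 0)


def inner(p, q):
    """Inner product of Equation 8.26, assuming real coefficients (no conjugation needed)."""
    return integrate(poly_mul(p, q))


# Coefficient lists for e_0 ... e_3, lowest power of x first.
basis = [
    [math.sqrt(1 / 2)],
    [0.0, math.sqrt(3 / 2)],
    [math.sqrt(5 / 2) * c / 2 for c in (-1.0, 0.0, 3.0)],
    [math.sqrt(7 / 2) * c / 2 for c in (0.0, -3.0, 0.0, 5.0)],
]

# Each row should be close to a row of the 4 x 4 unit matrix.
for e_i in basis:
    print([round(inner(e_i, e_j), 12) for e_j in basis])
```

Because the coefficients are real in this sketch, the complex conjugate in Equation 8.26 has no effect; complex coefficients would need the first polynomial's coefficients conjugated before multiplying.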


In quantum mechanics, a particularly important vector space (called function
space) is provided by the set of all normalizable functions, that is, functions f(x)
for which

$$\int_{-\infty}^{\infty} |f(x)|^2\,dx \quad\text{is finite.}$$

A suitable inner product for the space of normalizable functions is given by

$$\langle f|g\rangle = \int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx. \tag{8.29}$$


While it is quite a lengthy task to verify that function space is a vector space 
(having all the properties in Box 1), it is relatively easy to confirm that 
Equation 8.29 has all the properties in Box 2, and so provides a valid inner 
product. An example of an orthonormal basis in this inner product space is 
given by the set of energy eigenfunctions of a harmonic oscillator, which are 
orthonormal with respect to the inner product of Equation 8.29, and are complete 
in the sense that any reasonable function f(x) can be written as a linear 
combination of them. 


Exercise 8.7 Show that Equation 8.29 satisfies Property 1 in Box 2. 


Exercise 8.8 Show that the functions $e_1(x)$ and $e_2(x)$ in Equation 8.27 are
orthogonal with respect to the inner product of Equation 8.26.


8.3 Matrices and determinants


8.3.1 Matrix notation 


A matrix A is a set of objects (called matrix elements) arranged in a rectangular
pattern of rows and columns:

$$A = \begin{bmatrix} A_{11} & A_{12} & \ldots & A_{1n} \\ A_{21} & A_{22} & \ldots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} & A_{m2} & \ldots & A_{mn} \end{bmatrix}.$$

In the above example, the matrix has m rows and n columns, and is said to be an
m × n (pronounced ‘m by n’) matrix. Each matrix element $A_{ij}$ carries two
indices, the first labelling the row, and the second labelling the column of the
matrix element under consideration. A useful mnemonic for this is ‘Arc’, standing
for $A_{\text{row column}}$. In this course, matrix elements are generally scalars: real or
complex numbers with appropriate units of measurement.


Three different shapes of matrix will be important for our purposes: a square 
matrix has the same number of rows and columns, a row matrix has a single row, 
and a column matrix has a single column. We will generally consider matrices in 
two-dimensional situations, and so need to consider matrices of the form: 


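$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \qquad \begin{bmatrix} A_{11} & A_{12} \end{bmatrix} \qquad \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}$$

From left to right: 2 × 2 square matrix, 1 × 2 row matrix, 2 × 1 column matrix.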


Two matrices are said to be equal to one another if they have the same shape and 
size and all their corresponding elements are equal. 


8.3.2 Operations on matrices 


Matrices can be combined with scalars and with other matrices in a variety of 
ways. 


Multiplication by a scalar 


To multiply any matrix A by a scalar λ, we simply multiply each element of the
matrix by λ. So,

$$(\lambda A)_{ij} = \lambda A_{ij}.$$

For example,

$$2\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}.$$


Addition of matrices


Two matrices A and B of the same size and shape can be added together to give a 
new matrix 


C=A+B, 


where each matrix element of C is found by adding the corresponding elements of 
A and B: 


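$$C_{ij} = A_{ij} + B_{ij}.$$

For example,

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 0 & -1 \\ 5 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 8 & 6 \end{bmatrix}.$$
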
It is not possible to add two matrices of different shapes or sizes. 
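
The two element-by-element rules above can be sketched in Python. This is an illustration only: it assumes a matrix is stored as a list of rows, and the helper names shape, scalar_multiply and add_matrices, together with the sample matrices, are hypothetical choices made here.

```python
def shape(A):
    """Return (number of rows, number of columns) of a matrix stored as a list of rows."""
    return (len(A), len(A[0]))


def scalar_multiply(lam, A):
    """Multiply every element of A by the scalar lam."""
    return [[lam * element for element in row] for row in A]


def add_matrices(A, B):
    """Add corresponding elements; only defined for matrices of the same shape and size."""
    if shape(A) != shape(B):
        raise ValueError("matrices must have the same shape and size")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


# Sample matrices (assumed values).
A = [[1, 2], [3, 4]]
B = [[0, -1], [5, 2]]

print(scalar_multiply(2, A))  # [[2, 4], [6, 8]]
print(add_matrices(A, B))     # [[1, 1], [8, 6]]
```

Attempting add_matrices with, say, a 2 × 2 matrix and a 2 × 1 matrix raises a ValueError, mirroring the statement that such a sum is not defined.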


Matrix multiplication 


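A very important operation between two matrices A and B is that of multiplying
them together (matrix multiplication):

$$C = AB.$$

This operation only makes sense if the number of columns in the first matrix is
equal to the number of rows in the second matrix. For example, we can define the
matrix products

$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 3 & 4 & 7 \\ 5 & 3 & 1 \end{bmatrix},$$

but we cannot interpret

$$\begin{bmatrix} 1 & 2 & 3 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \quad\text{or}\quad \begin{bmatrix} 3 \\ 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}.$$

Let us suppose that A is an m × n matrix and B is an n × p matrix, then the
matrix elements of the product matrix C = AB are defined by

$$C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}. \tag{8.30}$$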
k=l 


Remember, $C_{ij}$ is the element in the ith row and jth column of C. To find this
matrix element, we go along the ith row of A and down the jth column of B,
multiplying corresponding elements and adding the results. This pattern may be
visualized as follows:

[Diagram: the ith row of A is combined with the jth column of B to give the element • in the ith row and jth column of C.]

where the • indicates a matrix element in the new matrix, C, and the arrows show
how matrix elements in the old matrices are processed to obtain this. The index k
in Equation 8.30 tells us how to match up elements in A and B to multiply
together. As k increases we simultaneously go along a row of A and down a
column of B. However, the choice of the letter k for this purpose is arbitrary:
we could equally well have chosen l, or anything other than i and j which are
reserved for the particular matrix element of C under consideration. Indices like
k, that are internal to expressions and do not affect the meaning of a whole
equation, are called dummy indices.
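
Equation 8.30 can be written almost word for word as a short Python function. The sketch below is illustrative rather than definitive: it assumes matrices are stored as lists of rows, the name matmul and the sample matrices are choices made here, and the loop variable k plays exactly the role of the dummy index.

```python
def matmul(A, B):
    """Return C = AB, where C[i][j] is the sum over k of A[i][k] * B[k][j] (Equation 8.30)."""
    n = len(B)  # number of rows of B
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    p = len(B[0])  # number of columns of B
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


# Sample matrices (assumed values).
A = [[1, 2], [3, 4]]        # 2 x 2
B = [[3, 4, 7], [5, 3, 1]]  # 2 x 3

print(matmul(A, B))  # [[13, 10, 9], [29, 24, 25]]

try:
    matmul(B, A)  # 2 x 3 times 2 x 2: columns of B do not match rows of A
except ValueError as error:
    print("undefined product:", error)
```

Renaming k to any other unused name inside the sum changes nothing, which is what it means for k to be a dummy index.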


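Matrix multiplication does not satisfy all the rules of ordinary multiplication. For
example, if A and B are non-square matrices we may be able to form the matrix
product AB, but be unable to define BA. For two square matrices of the same
size, we can define both AB and BA, but the result we get may depend on the
order of multiplication. For example, if

$$A = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 0 & b \\ b & 0 \end{bmatrix},$$

we have

$$AB = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}\begin{bmatrix} 0 & b \\ b & 0 \end{bmatrix} = \begin{bmatrix} 0 & ab \\ -ab & 0 \end{bmatrix},$$

$$BA = \begin{bmatrix} 0 & b \\ b & 0 \end{bmatrix}\begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix} = \begin{bmatrix} 0 & -ab \\ ab & 0 \end{bmatrix},$$

so AB ≠ BA in this case; we say that the matrices A and B are non-commuting
or that they do not commute with one another.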


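You should not suppose that all matrices fail to commute. For example, the matrix

$$I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

is called the 2 × 2 unit matrix. Multiplying by this matrix leaves any other 2 × 2
matrix A unchanged, no matter which order we use for the multiplication:

$$IA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = A,$$

$$AI = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = A.$$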
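
Whether two matrices commute is easily tested numerically. The Python sketch below is an illustration under stated assumptions: the 2 × 2 matrices A and B are chosen here (a diagonal matrix with entries 2 and −2, and an off-diagonal matrix with entries 3) simply because they fail to commute, and the helper name matmul is hypothetical.

```python
def matmul(A, B):
    """Matrix product with C[i][j] = sum over k of A[i][k] * B[k][j]."""
    n = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(len(B[0]))]
            for i in range(len(A))]


# Illustrative matrices (assumed values, not taken from the text).
A = [[2, 0], [0, -2]]
B = [[0, 3], [3, 0]]
I = [[1, 0], [0, 1]]  # the 2 x 2 unit matrix

print(matmul(A, B))  # [[0, 6], [-6, 0]]
print(matmul(B, A))  # [[0, -6], [6, 0]]
print(matmul(A, B) == matmul(B, A))       # False: A and B do not commute
print(matmul(I, A) == matmul(A, I) == A)  # True: the unit matrix commutes with A
```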


It is interesting to note the role played by matrix multiplication at the birth of
quantum mechanics. In the early summer of 1925, Heisenberg spent several
weeks on the island of Heligoland trying to reduce the effects of hay fever. In
this secluded environment, he made the first tentative steps towards quantum
mechanics, drawing up square tables of numbers and devising rules for combining
them. On his return to the mainland, Heisenberg showed his very promising
results to Born who pointed out that mathematicians would call the square tables
matrices, and call Heisenberg’s rule for combining them matrix multiplication.
Heisenberg had never heard of matrices! At first, the non-commutativity of
matrices alarmed Heisenberg; he wondered whether it was reasonable to describe
physical quantities by non-commuting objects, but it soon became clear that
this was a characteristic feature of the new quantum physics. Before long, the
properties of matrices were being studied by physicists all over the world.


Bases is the plural of basis.


8.3.3 Vectors as column matrices 


In an n-dimensional vector space, any vector $|a\rangle$ can be expressed as
\[
|a\rangle = \sum_{i=1}^{n} a_i |e_i\rangle,
\]
where the coefficients $a_i$ are the components of $|a\rangle$ in the orthonormal
basis $|e_i\rangle$. Provided we agree on the choice of basis, the set of components
$a_i$ is a compact way of specifying the vector. The coefficients can be arranged as
a column matrix, allowing us to say that:

The vector $|a\rangle$ is represented by the column matrix
\[
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}.
\]

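We have said ‘is represented by’ because the vector $|a\rangle$ has the same meaning
in all bases (bases is the plural of basis), while a specific column matrix
represents the vector only in a specific basis. Nevertheless, with the choice of
basis fixed, no harm is done in linking these two concepts with an equals sign, and
that is what we shall do here. For any vector $|a\rangle$ in a two-dimensional
space, we write
\[
|a\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}.
\]
In particular, the basis vectors, with components $(a_1 = 1, a_2 = 0)$ and
$(a_1 = 0, a_2 = 1)$, are given by
\[
|e_1\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\quad\text{and}\quad
|e_2\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]
In this representation, everything is consistent because column matrices obey
rules appropriate for vectors. For example, the equation
\[
|a\rangle = a_1 |e_1\rangle + a_2 |e_2\rangle
\]
becomes the identity
\[
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
= a_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
+ a_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]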

The inner product of two vectors $|a\rangle$ and $|b\rangle$ can also be represented
as the product of two matrices. However, we cannot simply multiply the two column
matrices representing $|a\rangle$ and $|b\rangle$. This is because matrix
multiplication is not defined for two column matrices. Instead, we convert the
first column matrix $|a\rangle$ into a row matrix and take the complex conjugate of
all its elements. This process of converting the columns of a matrix into rows and
taking the complex conjugates of the elements is called Hermitian conjugation; it
is like complex conjugation with the additional twist that columns are converted
into rows. We use the symbol $\langle a|$ to indicate the row matrix that is the
Hermitian conjugate of the column matrix $|a\rangle$. So,
\[
\text{if } |a\rangle = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
\text{ then } \langle a| = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}.
\]
Other books indicate row and column matrices by $a$ and $a^\dagger$, but our
notation is very frequently used in quantum mechanics.

Recalling that the inner product of two vectors
$|a\rangle = a_1|e_1\rangle + a_2|e_2\rangle$ and
$|b\rangle = b_1|e_1\rangle + b_2|e_2\rangle$ is defined to be
\[
\langle a|b\rangle = a_1^* b_1 + a_2^* b_2,
\]
we now see that the equivalent matrix representation is
\[
\langle a|b\rangle = \begin{bmatrix} a_1^* & a_2^* \end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix},
\]
where we use matrix multiplication to combine the row and column matrices.
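As a quick illustration of the row-times-column rule, here is a minimal Python sketch. It assumes a ket is stored as a list of components in a fixed orthonormal basis; the function names `bra` and `inner` and the sample components are hypothetical choices made for this example.

```python
def bra(ket):
    """Hermitian conjugate of a column matrix: a row of complex conjugates."""
    return [c.conjugate() for c in ket]

def inner(a, b):
    """Inner product <a|b> = sum_i a_i^* b_i (row matrix times column matrix)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    return sum(x * y for x, y in zip(bra(a), b))

# Hypothetical components, chosen only to exercise the rule.
a = [1 + 1j, 2]
b = [3, 1j]

print(inner(a, b))  # (3-1j)
print(inner(b, a))  # (3+1j), the complex conjugate of <a|b>
print(inner(a, a))  # (6+0j), real and non-negative
```

Note that the order matters: swapping the two vectors conjugates the result, in line with the rule that all inner products obey $\langle b|a\rangle = \langle a|b\rangle^*$.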


Exercise 8.9 Given that $|a\rangle = [\,\ldots\,]$ and $|b\rangle = [\,\ldots\,]$,
find the inner product $\langle a|b\rangle$.


8.3.4 Operators as square matrices 


Let us consider the effect of applying an operation to a vector.
Think, for example, of rotating a vector, or stretching it, or
applying some combination of rotation and stretching. Suppose
that the initial vector is $|v\rangle$, and that the result of acting on
it with the linear operator $\hat{A}$ is a new vector
\[
|v'\rangle = \hat{A}|v\rangle. \tag{8.31}
\]
This operation is illustrated in Figure 8.7. We can write the
initial vector as
\[
|v\rangle = \sum_j v_j |e_j\rangle, \tag{8.32}
\]
where the $|e_j\rangle$ are orthonormal basis vectors and the $v_j$ are
the components of $|v\rangle$ in this basis. Substituting this expression
into Equation 8.31 gives
\[
|v'\rangle = \hat{A}|v\rangle = \hat{A}\Big(\sum_j v_j |e_j\rangle\Big)
= \sum_j v_j \big(\hat{A}|e_j\rangle\big), \tag{8.33}
\]
where the last step follows because the operator is linear.

Figure 8.7 Visualizing the effect of an operator $\hat{A}$ on a vector $|v\rangle$.

Now $|v'\rangle$ is a vector in the same vector space as $|v\rangle$, so it can be
expanded in the same basis to give
\[
|v'\rangle = \sum_j v'_j |e_j\rangle. \tag{8.34}
\]
We ask: what is the relationship between the components $v'_i$ and $v_i$ of the
transformed and initial vectors?


To answer this question, we use the familiar trick of taking the inner product
of both sides of Equation 8.34 with one of the basis vectors, $|e_i\rangle$. Using
the orthonormality of the basis vectors ($\langle e_i|e_j\rangle = \delta_{ij}$), we
obtain
\[
\langle e_i|v'\rangle = \sum_j v'_j \langle e_i|e_j\rangle
= \sum_j v'_j \delta_{ij} = v'_i.
\]
Combining this result with Equation 8.33, we see that
\[
v'_i = \langle e_i|v'\rangle = \sum_j v_j \langle e_i|\hat{A}|e_j\rangle. \tag{8.35}
\]
The quantity $\langle e_i|\hat{A}|e_j\rangle$ is a scalar that depends on the
operator $\hat{A}$ and the choice of two basis vectors, $|e_i\rangle$ and
$|e_j\rangle$. For reasons that will soon become apparent, it is convenient to
denote this quantity by $A_{ij}$, so we have
\[
A_{ij} = \langle e_i|\hat{A}|e_j\rangle. \tag{8.36}
\]
Using this notation in Equation 8.35, we then obtain
\[
v'_i = \sum_j A_{ij} v_j. \tag{8.37}
\]
Finally, we can use the rule for matrix multiplication to express this result as:
\[
\begin{bmatrix} v'_1 \\ v'_2 \\ \vdots \\ v'_n \end{bmatrix}
=
\begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots &        & \vdots \\
A_{n1} & A_{n2} & \cdots & A_{nn}
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix},
\tag{8.38}
\]
where A is the square matrix whose matrix elements are given by Equation 8.36.
This matrix represents the operator $\hat{A}$, in much the same way that the column
matrices in Equation 8.38 represent the initial and final vectors $|v\rangle$ and
$|v'\rangle$. For this reason, the numbers $A_{ij}$ are often called the matrix
elements of the operator $\hat{A}$. Note that the square matrix and the column
matrices have definite descriptions in any given basis, but the numbers that
appear in these matrices may vary from basis to basis.

The matrix A may also be written as $\hat{A}$ in cases where we wish to emphasize
its role as an operator acting on column matrices.
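The construction of Equations 8.36 to 8.38 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operator is supplied as a Python function acting on component lists, the basis is the standard orthonormal one, and the names `matrix_of`, `apply` and the sample operator `op` are hypothetical.

```python
def basis(n):
    """Standard orthonormal basis vectors e_1..e_n as component lists."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]

def inner(a, b):
    """Inner product <a|b> = sum_i a_i^* b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def matrix_of(op, n):
    """Matrix elements A_ij = <e_i| op |e_j> (Equation 8.36)."""
    e = basis(n)
    return [[inner(e[i], op(e[j])) for j in range(n)] for i in range(n)]

def apply(A, v):
    """Components v'_i = sum_j A_ij v_j (Equation 8.37)."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def op(v):
    """A hypothetical linear operator: swap the components, doubling the new first one."""
    return [2 * v[1], v[0]]

A = matrix_of(op, 2)
v = [3, 5]
print(A)            # [[0, 2], [1, 0]]
print(apply(A, v))  # [10, 3]
print(op(v))        # [10, 3], the same vector obtained directly
```

The agreement between `apply(A, v)` and `op(v)` relies on `op` being linear; for a non-linear function the matrix built from the basis vectors would not reproduce its action.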


As an example of these ideas, consider the operator $\hat{R}(\theta)$ that rotates
real two-dimensional vectors through an anticlockwise angle $\theta$. Figure 8.8
shows the effect of this operation on a pair of orthonormal basis vectors.

Figure 8.8 The operator $\hat{R}(\theta)$ rotates vectors through an anticlockwise
angle $\theta$. A matrix element such as $R_{21} = \langle e_2|\hat{R}|e_1\rangle$ is
given by the projection of the rotated vector $\hat{R}|e_1\rangle$ onto the 2-axis.
In the case shown, $R_{12}$ has a negative value.

Using elementary geometry, we can see that
\[
R_{11} = \cos\theta, \quad R_{21} = \sin\theta, \quad
R_{12} = -\sin\theta, \quad\text{and}\quad R_{22} = \cos\theta.
\]
Hence the anticlockwise rotation matrix is
\[
R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.
\]
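A short numerical sketch may help to make the rotation matrix concrete. It assumes angles in radians and uses hypothetical helper names (`rotation`, `matmul`, `apply`, `close`); the two test angles are arbitrary.

```python
import math

def rotation(theta):
    """Anticlockwise rotation matrix for an angle theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(A, v):
    """Act with the matrix A on the column matrix v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def close(A, B, tol=1e-12):
    """Element-by-element comparison of two matrices within a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# Rotating the first basis vector through a quarter turn gives the second.
rotated = apply(rotation(math.pi / 2), [1.0, 0.0])
print(rotated)  # approximately [0.0, 1.0]

# Two successive rotations combine into a single rotation by the summed angle.
alpha, beta = 0.3, 1.1  # arbitrary test angles
print(close(matmul(rotation(alpha), rotation(beta)), rotation(alpha + beta)))  # True
```

Rotations about the same axis are an example of matrices that do commute: the product is the same whichever rotation is applied first.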


Exercise 8.10 Use matrix methods to find the effect of a 90° anticlockwise 
rotation on 


Exercise 8.11 By considering the effect of two successive anticlockwise
rotations through $\theta$, obtain general formulae for $\cos(2\theta)$ and
$\sin(2\theta)$ in terms of $\cos\theta$ and $\sin\theta$.


Hermitian matrices 


In the physics chapters of this book, you will see that most of the operators that
are important in quantum physics have the special property of being Hermitian.
In the context of vector spaces, this means that
\[
\langle u|\hat{A}|v\rangle = \langle u|\hat{A}v\rangle = \langle \hat{A}u|v\rangle \tag{8.39}
\]
for all vectors $|u\rangle$ and $|v\rangle$. Since all inner products obey
$\langle b|a\rangle = \langle a|b\rangle^*$, the Hermitian condition can also be
expressed as
\[
\langle u|\hat{A}|v\rangle = \langle v|\hat{A}|u\rangle^*. \tag{8.40}
\]
Applying this condition in the special case where $|u\rangle = |e_i\rangle$ and
$|v\rangle = |e_j\rangle$ are orthonormal basis vectors, we see that a Hermitian
operator obeys
\[
\langle e_i|\hat{A}|e_j\rangle = \langle e_j|\hat{A}|e_i\rangle^*,
\]
so the matrix elements of a Hermitian operator obey the condition
\[
A_{ij} = A_{ji}^*. \tag{8.41}
\]
Any matrix for which this applies is said to be a Hermitian matrix. A Hermitian
matrix is unchanged by the operation of Hermitian conjugation, which converts rows
into columns and takes the complex conjugate of all matrix elements. For example,
\[
\begin{bmatrix} 2 & 1+i \\ 1-i & 3 \end{bmatrix}
\]
is a Hermitian matrix. The tell-tale signs are that the matrix elements along the
main diagonal ($A_{11}$ and $A_{22}$) are real, and pairs of matrix elements
reflected across the main diagonal ($A_{12}$ and $A_{21}$) are complex conjugates
of one another.

Although we shall not prove it, it is possible to show that any Hermitian matrix
acts as a Hermitian operator on vectors (that is, Equation 8.40 follows from
Equation 8.41).
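The Hermitian condition is easy to test numerically. The sketch below is illustrative only: it assumes matrices are lists of rows with exact small-integer or complex entries, and the names `dagger`, `is_hermitian` and the test vectors `u`, `v` are hypothetical. The matrix `H` is the example matrix with elements 2, 1+i, 1-i and 3.

```python
def dagger(A):
    """Hermitian conjugate: rows become columns and elements are conjugated."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def is_hermitian(A):
    """True if A_ij = A_ji^* for all i, j (Equation 8.41)."""
    return A == dagger(A)

def apply(A, v):
    """Act with the matrix A on the column matrix v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def inner(a, b):
    """Inner product <a|b> = sum_i a_i^* b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

H = [[2, 1 + 1j], [1 - 1j, 3]]
u = [1, 2j]   # arbitrary test vectors
v = [1j, 1]

print(is_hermitian(H))                     # True
print(is_hermitian([[0, 1j], [1j, 0]]))    # False
print(inner(u, apply(H, v)))               # (3-5j)
print(inner(apply(H, u), v))               # (3-5j), as Equation 8.39 requires
```

A single pair of vectors does not prove the general statement, of course; it simply shows one instance of $\langle u|\hat{A}v\rangle = \langle \hat{A}u|v\rangle$.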
8.3.5 The determinant of a square matrix


Each square n x n matrix
\[
A = \begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots &        & \vdots \\
A_{n1} & A_{n2} & \cdots & A_{nn}
\end{bmatrix}
\]
has a corresponding determinant
\[
\det A = \begin{vmatrix}
A_{11} & A_{12} & \cdots & A_{1n} \\
A_{21} & A_{22} & \cdots & A_{2n} \\
\vdots & \vdots &        & \vdots \\
A_{n1} & A_{n2} & \cdots & A_{nn}
\end{vmatrix}. \tag{8.42}
\]

Note the subtle difference in notation: the matrix uses square brackets, while the
determinant uses vertical lines. However, there is an immense difference in
meaning between these two concepts. The matrix A is an array of $n^2$ elements,
but the determinant det A is a particular combination of these elements that
reduces to a single scalar quantity.


For a 2 x 2 matrix, we define
\[
\begin{vmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{vmatrix}
= A_{11}A_{22} - A_{12}A_{21}.
\]


Although not needed in this book, we shall briefly indicate how larger
determinants are calculated. To do this, we need the following definition:

The cofactor of a given matrix element $A_{ij}$ is found by striking out the row and
column that contain the given element, forming the determinant of the remaining
elements in the order that they appear in the matrix, and then multiplying the
result by $(-1)^{i+j}$, where i and j are the row and column numbers of the given
matrix element. For example, the cofactor of $A_{11}$ in Equation 8.42 is
\[
\operatorname{cof}(A_{11}) = (-1)^{1+1}
\begin{vmatrix}
A_{22} & A_{23} & \cdots & A_{2n} \\
A_{32} & A_{33} & \cdots & A_{3n} \\
\vdots & \vdots &        & \vdots \\
A_{n2} & A_{n3} & \cdots & A_{nn}
\end{vmatrix}.
\]

The determinant of a matrix is evaluated by selecting any complete row or
column and taking the sum of all the elements in that row or column, multiplied
by their corresponding cofactors.

Applying this procedure to a 3 x 3 matrix, we see that
\[
\det A =
\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}
= a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix}
- a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix}
+ a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}
\]
\[
= a_1(b_2c_3 - b_3c_2) - a_2(b_1c_3 - b_3c_1) + a_3(b_1c_2 - b_2c_1).
\]
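The cofactor rule translates directly into a short recursive routine. The following Python sketch expands along the first row; the function names are hypothetical, and indices are counted from zero, which leaves the sign factor $(-1)^{i+j}$ unchanged because i + j has the same parity in either convention. It is meant as an illustration of the definition rather than an efficient method for large matrices.

```python
def minor(A, i, j):
    """Matrix left after striking out row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def cofactor(A, i, j):
    """Cofactor of A[i][j]: signed determinant of the corresponding minor."""
    return (-1) ** (i + j) * det(minor(A, i, j))

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum(A[0][j] * cofactor(A, 0, j) for j in range(len(A)))

print(det([[1, 2], [3, 4]]))                    # -2
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))  # -3
```

For the 3 x 3 example, the three terms of the expansion are 1(50 - 48), -2(40 - 42) and 3(32 - 35), which sum to -3.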


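One application of this result is the vector product of two vectors in
three-dimensional space, which is given by
\[
\mathbf{b} \times \mathbf{c} =
\begin{vmatrix}
\mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z \\
b_x & b_y & b_z \\
c_x & c_y & c_z
\end{vmatrix},
\]
and can be expanded to give
\[
\mathbf{b} \times \mathbf{c} = (b_y c_z - b_z c_y)\,\mathbf{e}_x
+ (b_z c_x - b_x c_z)\,\mathbf{e}_y + (b_x c_y - b_y c_x)\,\mathbf{e}_z,
\]
in agreement with Equation 8.14.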


8.3.6 The inverse of a matrix 


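For a square matrix A, it is sometimes possible to find a matrix $A^{-1}$ such that
\[
A A^{-1} = A^{-1} A = I, \tag{8.43}
\]
where I is a unit matrix. In this case, $A^{-1}$ is called the inverse matrix of A.
For example, if
\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\quad\text{then}\quad
A^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]
because
\[
A^{-1}A = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
and
\[
AA^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
\]

The general rule for constructing the inverse of a 2 x 2 matrix is as follows. If
\[
A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix},
\]
the inverse matrix $A^{-1}$ is given by
\[
A^{-1} = \frac{1}{\det A}
\begin{bmatrix} A_{22} & -A_{12} \\ -A_{21} & A_{11} \end{bmatrix}, \tag{8.44}
\]
provided that $\det A = A_{11}A_{22} - A_{12}A_{21} \neq 0$. If det A = 0, the matrix
A has no inverse. Conversely, if A has no inverse, then det A = 0.

More generally, a square matrix A with matrix elements $A_{ij}$ and
$\det A \neq 0$ has an inverse matrix $A^{-1}$, with matrix elements
\[
(A^{-1})_{ij} = \frac{\operatorname{cof}(A_{ji})}{\det A}, \tag{8.45}
\]
where $\operatorname{cof}(A_{ji})$ is the cofactor of matrix element $A_{ji}$,
introduced in the context of determinants. But if det A = 0, no inverse matrix
exists.

Note that the order of subscripts on the right-hand side of this equation is the
reverse of that on the left; we use cofactors of the transpose of A.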
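The 2 x 2 rule can be sketched as follows. This Python fragment assumes integer matrix elements so that exact fractions can be used; the function names and the sample matrix are hypothetical choices for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(A):
    """Inverse of a 2 x 2 integer matrix: swap the diagonal, negate the rest, divide by det A."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("det A = 0, so A has no inverse")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[1, 2], [3, 4]]        # sample matrix with det A = -2
A_inv = inverse_2x2(A)
print(A_inv)                # elements -2, 1, 3/2, -1/2 as Fractions
print(matmul(A, A_inv) == [[1, 0], [0, 1]])  # True
print(matmul(A_inv, A) == [[1, 0], [0, 1]])  # True
```

Calling `inverse_2x2([[1, 2], [2, 4]])` raises the `ValueError`, since that matrix has zero determinant.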
8.3.7 Eigenvalues and eigenvectors

Essential skill: Finding the eigenvalues and eigenvectors of a matrix


For any square matrix A, we can set up the matrix eigenvalue equation
\[
AX = \lambda X, \tag{8.46}
\]
where X is a column matrix and $\lambda$ is a scalar.

Any non-zero column matrix X that satisfies this equation is called an eigenvector
of A, and the corresponding value of $\lambda$ is called an eigenvalue. (We insist
that X be non-zero, otherwise Equation 8.46 would be trivially satisfied by any
value of $\lambda$.)

To find the eigenvectors and eigenvalues of a given n x n square matrix, we
rearrange Equation 8.46 as follows:
\[
(A - \lambda I)X = 0, \tag{8.47}
\]
where I is an n x n unit matrix and 0 is an n x 1 column matrix consisting
entirely of zeros.

Now, let us suppose that the matrix $A - \lambda I$ has an inverse,
$(A - \lambda I)^{-1}$. If this were so, we would be able to act with this inverse
on both sides of Equation 8.47 to obtain
\[
X = (A - \lambda I)^{-1}\, 0 = 0.
\]
However, this possibility is ruled out by our assumption that X is a non-zero
column matrix. We therefore conclude that $A - \lambda I$ has no inverse, and from
our previous discussion of inverses, this implies that
\[
\det(A - \lambda I) = 0. \tag{8.48}
\]
This is called the characteristic equation of the matrix A. For an n x n matrix
A, the characteristic equation is an nth-order polynomial equation in $\lambda$,
which has n (not necessarily distinct) solutions; these are the eigenvalues of A.


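Worked Example 8.1

Find the eigenvalues and normalized eigenvectors of
$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$.

Solution
In this case, the characteristic equation is
\[
\det(A - \lambda I) =
\begin{vmatrix} 1-\lambda & 1 \\ 1 & 1-\lambda \end{vmatrix} = 0,
\]
which gives
\[
(1 - \lambda)^2 - 1 = 0,
\]
so the two eigenvalues are $\lambda = 0$ and $\lambda = 2$.

To find the corresponding eigenvectors, we substitute each eigenvalue in turn
back into the eigenvalue equation in the form of Equation 8.47.

For $\lambda = 0$, we obtain
\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
which gives $x_1 = -x_2$, so the eigenvector corresponding to $\lambda = 0$ is
\[
X_1 = \alpha \begin{bmatrix} 1 \\ -1 \end{bmatrix},
\quad\text{where } \alpha \text{ is a non-zero constant.}
\]
For $\lambda = 2$, we obtain
\[
\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
which gives $x_1 = x_2$, so the eigenvector corresponding to $\lambda = 2$ is
\[
X_2 = \beta \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\quad\text{where } \beta \text{ is a non-zero constant.}
\]
These eigenvectors can be normalized by taking $\alpha = \beta = 1/\sqrt{2}$.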
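The procedure of the worked example can be sketched for any 2 x 2 matrix. For such a matrix the characteristic equation is the quadratic $\lambda^2 - (A_{11} + A_{22})\lambda + \det A = 0$, which the Python fragment below solves directly. The function names are hypothetical, and the eigenvector routine is a simple illustration that returns one normalized solution of Equation 8.47 per eigenvalue.

```python
import cmath
import math

def eigenvalues_2x2(A):
    """Roots of lambda^2 - (trace) lambda + det = 0 for a 2 x 2 matrix."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2

def normalized_eigenvector_2x2(A, lam):
    """One normalized solution X of (A - lam I) X = 0, given an eigenvalue lam."""
    (a, b), (c, d) = A
    if b != 0:
        x = [b, lam - a]        # satisfies the first row of (A - lam I) X = 0
    elif c != 0:
        x = [lam - d, c]        # satisfies the second row
    else:                       # diagonal matrix: basis vectors are eigenvectors
        x = [1, 0] if lam == a else [0, 1]
    norm = math.sqrt(sum(abs(xi) ** 2 for xi in x))
    return [xi / norm for xi in x]

A = [[1, 1], [1, 1]]            # the matrix with all four elements equal to 1
lam1, lam2 = eigenvalues_2x2(A)
X1 = normalized_eigenvector_2x2(A, lam1)
X2 = normalized_eigenvector_2x2(A, lam2)
print(lam1, lam2)  # 0j (2+0j)
print(X1)          # proportional to (1, -1), with magnitude 1
print(X2)          # proportional to (1, 1), with magnitude 1
```

The values are returned as complex numbers because `cmath.sqrt` is used, which lets the same sketch handle matrices whose eigenvalues are not real.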


Exercise 8.12 Find the eigenvalues and eigenvectors of $A = [\,\ldots\,]$.


Exercise 8.13 Find the eigenvalues and eigenvectors of $A = [\,\ldots\,]$.


Exercise 8.13 shows that it is possible for an n x n square matrix to have fewer 
than n independent eigenvectors. However, the Hermitian matrices that are 
important in quantum mechanics are special in this respect. All n x n Hermitian 
matrices have the following properties, which we state without proof: 


The eigenvalues and eigenvectors of Hermitian matrices 


1. All their eigenvalues are real. 
2. Eigenvectors with different eigenvalues are orthogonal. 


3. It is always possible to find a set of n mutually orthogonal eigenvectors,
even if some of them share the same eigenvalue. These eigenvectors provide 
a basis for an n-dimensional vector space. 


4. Given two n x n Hermitian matrices, it is possible to find a set of n mutually 
orthogonal eigenvectors of both matrices if, and only if, the two matrices 
commute. 
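Property 4 can be illustrated with a simple pair of real symmetric (and hence Hermitian) matrices. The sketch below is an example under stated assumptions, not a proof: the matrices `A` and `B`, the candidate vectors, and the helper names are all chosen here for illustration.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def apply(M, v):
    """Act with the matrix M on the column matrix v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def eigenvalue_if_eigenvector(M, v, tol=1e-12):
    """Return lam if M v = lam v for the non-zero vector v, otherwise None."""
    w = apply(M, v)
    k = next(i for i, vi in enumerate(v) if vi != 0)
    lam = w[k] / v[k]
    ok = all(abs(wi - lam * vi) < tol for wi, vi in zip(w, v))
    return lam if ok else None

A = [[1, 1], [1, 1]]
B = [[0, 1], [1, 0]]
v_plus, v_minus = [1, 1], [1, -1]   # candidate common eigenvectors

print(matmul(A, B) == matmul(B, A))           # True: A and B commute
print(eigenvalue_if_eigenvector(A, v_plus))   # 2.0
print(eigenvalue_if_eigenvector(B, v_plus))   # 1.0
print(eigenvalue_if_eigenvector(A, v_minus))  # 0.0
print(eigenvalue_if_eigenvector(B, v_minus))  # -1.0
print(sum(x * y for x, y in zip(v_plus, v_minus)))  # 0: the two vectors are orthogonal
```

Here the two vectors are orthogonal eigenvectors of both matrices, and all four eigenvalues are real, consistent with properties 1, 2 and 4.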
Solutions to exercises


Ex 1.1 (a) The normalization condition is
\[
\langle \Psi|\Psi \rangle = 1.
\]
(b) According to Chapter 4 of Book 1, the probability
of measuring energy $E_i$ in a state described by the wave
function $\Psi(x, t)$ is
\[
\text{probability} = \left| \int_{-\infty}^{\infty} \psi_i^*(x)\, \Psi(x, t)\, dx \right|^2,
\]
where $\psi_i(x)$ is the energy eigenfunction corresponding
to the eigenvalue $E_i$. In Dirac notation, this can be
written as
\[
\text{probability} = |\langle \psi_i|\Psi \rangle|^2.
\]
Time is left out of this equation, being determined by
the context. In this case, the appropriate time is the
instant t of the measurement; if you wish to be precise
about this, you can write $|\langle \psi_i|\Psi \rangle_t|^2$.


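Ex 1.2 A complex number z is real if $z^* = z$. Let
\[
z_1 = \langle f|f\rangle, \quad z_2 = \langle f|g\rangle\langle g|f\rangle
\quad\text{and}\quad z_3 = \langle f|g\rangle + \langle g|f\rangle.
\]
Then Equation 1.21 gives
\[
z_1^* = \langle f|f\rangle^* = \langle f|f\rangle = z_1,
\]
\[
z_2^* = \langle f|g\rangle^* \langle g|f\rangle^* = \langle g|f\rangle\langle f|g\rangle
= \langle f|g\rangle\langle g|f\rangle = z_2,
\]
\[
z_3^* = \langle f|g\rangle^* + \langle g|f\rangle^* = \langle g|f\rangle + \langle f|g\rangle
= \langle f|g\rangle + \langle g|f\rangle = z_3.
\]
So all these quantities are real.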


Ex 1.3 Using Equations 1.22 and 1.23, we have
\[
\langle f|f\rangle = \langle e^{i\alpha}g|e^{i\alpha}g\rangle
= e^{-i\alpha} e^{i\alpha} \langle g|g\rangle = \langle g|g\rangle.
\]
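Ex 1.4 From Equations 1.24 and 1.22,
\[
\langle f|f + ig\rangle = \langle f|f\rangle + \langle f|ig\rangle
= \langle f|f\rangle + i\langle f|g\rangle.
\]
From Equations 1.25 and 1.23,
\[
\langle g - if|g\rangle = \langle g|g\rangle - \langle if|g\rangle
= \langle g|g\rangle + i\langle f|g\rangle.
\]
Subtracting these two results and using the fact that
$\langle f|f\rangle = \langle g|g\rangle$, we conclude that
\[
\langle f|f + ig\rangle - \langle g - if|g\rangle
= \langle f|f\rangle - \langle g|g\rangle = 0.
\]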




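Ex 1.5 The vectors $|a\rangle$ and $|c\rangle$ will be orthogonal if
$\langle a|c\rangle = 0$. Taking the inner product of
$|c\rangle = |a\rangle + \beta|b\rangle$ with $|a\rangle$ gives
\[
\langle a|c\rangle = \langle a|a\rangle + \beta\langle a|b\rangle.
\]
The requirement that $\langle a|c\rangle = 0$ is achieved by taking
\[
\beta = -\langle a|a\rangle / \langle a|b\rangle.
\]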
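Ex 1.6 (a) Given $|a\rangle = |u\rangle + |v\rangle$ and
$|b\rangle = |u\rangle - |v\rangle$, we have
\[
\langle a|b\rangle = \big(\langle u| + \langle v|\big)\big(|u\rangle - |v\rangle\big)
= \langle u|u\rangle + \langle v|u\rangle - \langle u|v\rangle - \langle v|v\rangle
= 1 + 0 - 0 - 1 = 0.
\]
(b) Given $|c\rangle = |u\rangle + i|v\rangle$ and
$|d\rangle = i|u\rangle + |v\rangle$, we have
\[
\langle c|d\rangle = \big(\langle u| - i\langle v|\big)\big(i|u\rangle + |v\rangle\big)
= i\langle u|u\rangle + \langle v|u\rangle + \langle u|v\rangle - i\langle v|v\rangle
= i + 0 + 0 - i = 0.
\]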


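Ex 1.7 We are given the ket vector
\[
|c\rangle = \langle b|b\rangle\, |a\rangle - \langle b|a\rangle\, |b\rangle,
\]
and we know that $\langle b|b\rangle^* = \langle b|b\rangle$ and
$\langle b|a\rangle^* = \langle a|b\rangle$, so the corresponding bra vector is
\[
\langle c| = \langle b|b\rangle^* \langle a| - \langle b|a\rangle^* \langle b|
= \langle b|b\rangle \langle a| - \langle a|b\rangle \langle b|.
\]
Joining these bra and ket vectors together gives
\[
\langle c|c\rangle = \big(\langle b|b\rangle \langle a| - \langle a|b\rangle \langle b|\big)
\big(\langle b|b\rangle |a\rangle - \langle b|a\rangle |b\rangle\big).
\]
When we multiply out the round brackets, three of the
terms differ only in sign, so cancellations occur leading
to
\[
\langle c|c\rangle = \langle b|b\rangle^2 \langle a|a\rangle
- \langle a|b\rangle \langle b|a\rangle \langle b|b\rangle
= \langle b|b\rangle \big( \langle a|a\rangle \langle b|b\rangle - |\langle a|b\rangle|^2 \big).
\]
We must always have $\langle c|c\rangle \geq 0$, so
\[
\langle b|b\rangle \big( \langle a|a\rangle \langle b|b\rangle - |\langle a|b\rangle|^2 \big) \geq 0.
\]
Assume for the moment that $|b\rangle$ is not the zero vector.
Then we have $\langle b|b\rangle > 0$ and so
\[
\langle a|a\rangle \langle b|b\rangle - |\langle a|b\rangle|^2 \geq 0,
\]
which gives the Cauchy-Schwarz inequality
\[
\langle a|a\rangle \langle b|b\rangle \geq |\langle a|b\rangle|^2.
\]
If $|b\rangle$ is the zero vector, we have $\langle b|b\rangle = 0$ and
$\langle a|b\rangle = 0$, so the Cauchy-Schwarz inequality is still
satisfied, as the equality 0 = 0, in this case.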


Ex 1.8 Explicitly, 
f EOG f Are ade 


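Ex 1.9 We have
\[
\left\langle f \Big| \frac{\partial g}{\partial x} \right\rangle
+ \left\langle \frac{\partial f}{\partial x} \Big| g \right\rangle
= \int_{-\infty}^{\infty} \left( f^* \frac{\partial g}{\partial x}
+ \frac{\partial f^*}{\partial x}\, g \right) dx = 0,
\]
where the last step follows from the chain of reasoning
leading to Equation 1.37. If the operator $\partial/\partial x$ were
Hermitian, we would also have
\[
\left\langle f \Big| \frac{\partial g}{\partial x} \right\rangle
= \left\langle \frac{\partial f}{\partial x} \Big| g \right\rangle.
\]
Taken together, these equations imply that
\[
\int_{-\infty}^{\infty} f^* \frac{\partial g}{\partial x}\, dx = 0
\]
for all normalizable functions f(x) and g(x). This
is clearly not true. (For example, it is not true if
$f(x) = x e^{-x^2}$ and $g(x) = e^{-x^2}$.) Hence $\partial/\partial x$ cannot
be Hermitian; it cannot represent any observable
quantity.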


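Ex 1.10 Assuming that $\hat{A}$ and $\hat{B}$ are Hermitian, we
have
\[
\langle f|\hat{A}g\rangle = \langle \hat{A}f|g\rangle, \qquad
\langle f|\hat{B}g\rangle = \langle \hat{B}f|g\rangle,
\]
for all normalizable f and g. So
\[
\langle f|(\hat{A} + \hat{B})g\rangle
= \langle f|\hat{A}g\rangle + \langle f|\hat{B}g\rangle
= \langle \hat{A}f|g\rangle + \langle \hat{B}f|g\rangle
= \langle (\hat{A} + \hat{B})f|g\rangle,
\]
as required.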


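Ex 1.11 For any functions f(x) and g(x),
Equation 1.39 tells us that
\[
\langle f|\hat{A}\hat{B}g\rangle = \langle \hat{B}\hat{A}f|g\rangle.
\]
Similarly,
\[
\langle f|\hat{B}\hat{A}g\rangle = \langle \hat{A}\hat{B}f|g\rangle.
\]
Adding these two equations together gives
\[
\langle f|(\hat{A}\hat{B} + \hat{B}\hat{A})g\rangle
= \langle (\hat{B}\hat{A} + \hat{A}\hat{B})f|g\rangle
= \langle (\hat{A}\hat{B} + \hat{B}\hat{A})f|g\rangle.
\]
Hence $\hat{A}\hat{B} + \hat{B}\hat{A}$ is Hermitian.
Since $\hat{x}$ and $\hat{p}_x$ are both Hermitian operators and $\tfrac{1}{2}$ is a
real number, we conclude that
$\tfrac{1}{2}(\hat{x}\hat{p}_x + \hat{p}_x\hat{x})$ is a
Hermitian operator.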


Ex 1.12 When $\hat{A} = \hat{I}$, the left-hand side of
Equation 1.46 becomes
\[
\frac{d\langle I\rangle}{dt} = \frac{d}{dt}\langle \Psi|\hat{I}|\Psi\rangle
= \frac{d}{dt}\langle \Psi|\Psi\rangle.
\]
The identity operator commutes with any other
operator, so the right-hand side of Equation 1.46 gives
\[
\frac{1}{i\hbar}\big\langle [\hat{I}, \hat{H}] \big\rangle = 0.
\]
So, in this case, the generalized Ehrenfest theorem
tells us that $d\langle \Psi|\Psi\rangle/dt = 0$. This shows that the
normalization of the wave function is preserved; if the
wave function of an isolated system is normalized at any
initial time, it will remain normalized at all future times.


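Ex 1.13 We have
\[
[\hat{A}, \hat{B} + \hat{C}] = \hat{A}(\hat{B} + \hat{C}) - (\hat{B} + \hat{C})\hat{A}
= \hat{A}\hat{B} - \hat{B}\hat{A} + \hat{A}\hat{C} - \hat{C}\hat{A}
= [\hat{A}, \hat{B}] + [\hat{A}, \hat{C}].
\]
Hence $[\hat{A}, \hat{B} + \hat{C}] = [\hat{A}, \hat{B}]$ if $[\hat{A}, \hat{C}] = 0$.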


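Ex 1.14 Using $\hat{p}_x = -i\hbar\,\partial/\partial x$ and $\hat{V}(x) = V(x)$,
and applying the required commutator to an arbitrary
function f(x), gives
\[
[\hat{p}_x, \hat{V}(x)]\, f(x)
= -i\hbar \frac{\partial}{\partial x}\big(V(x) f(x)\big)
- V(x)\left(-i\hbar \frac{\partial f}{\partial x}\right)
= -i\hbar \frac{\partial V}{\partial x}\, f(x).
\]
Since this equation is true for any f(x), we are entitled
to write it as an operator equation
\[
[\hat{p}_x, \hat{V}(x)] = -i\hbar \frac{\partial V}{\partial x},
\]
where the right-hand side represents the action
of multiplying by $-i\hbar\, \partial V/\partial x$. Combining this
commutation relation with Equation 1.51, we obtain
\[
\frac{d\langle p_x\rangle}{dt}
= \frac{1}{i\hbar}\left\langle -i\hbar \frac{\partial V}{\partial x} \right\rangle
= -\left\langle \frac{\partial V}{\partial x} \right\rangle,
\]
which is Ehrenfest’s second equation (Equation 1.43).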


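Ex 1.15 The operator $\hat{H}^2$ commutes with $\hat{H}$ because
\[
[\hat{H}^2, \hat{H}] = \hat{H}^3 - \hat{H}^3 = 0.
\]
Hence $\langle H^2\rangle = \langle E^2\rangle$ remains constant in time. We
already know that $\langle E\rangle$ remains constant in time, so the
uncertainty in energy, $\Delta E = \sqrt{\langle E^2\rangle - \langle E\rangle^2}$, also
remains constant in time.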


Ex 1.16 None. The operator $\hat{x} = x$ commutes with
the operator $\hat{p}_y = -i\hbar\, \partial/\partial y$, so the right-hand side of
Equation 1.57 is equal to zero.


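Ex 1.17 Taking the modulus of the generalized
Ehrenfest theorem (Equation 1.46) gives
\[
\left| \frac{d\langle A\rangle}{dt} \right|
= \left| \frac{1}{i\hbar} \big\langle [\hat{A}, \hat{H}] \big\rangle \right|
= \frac{1}{\hbar} \big| \big\langle [\hat{A}, \hat{H}] \big\rangle \big|.
\]
Combining this with the generalized uncertainty
principle (Equation 1.57), and recalling that the
Hamiltonian operator $\hat{H}$ is the energy operator of the
system, we obtain
\[
\left| \frac{d\langle A\rangle}{dt} \right|
\leq \frac{2}{\hbar}\, \Delta A\, \Delta H = \frac{2}{\hbar}\, \Delta A\, \Delta E,
\]
as required.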


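Ex 1.18 We have
\[
(\hat{A} - \langle A\rangle)(\hat{B} - \langle B\rangle)
= \hat{A}\hat{B} - \langle A\rangle\hat{B} - \langle B\rangle\hat{A} + \langle A\rangle\langle B\rangle,
\]
\[
(\hat{B} - \langle B\rangle)(\hat{A} - \langle A\rangle)
= \hat{B}\hat{A} - \langle B\rangle\hat{A} - \langle A\rangle\hat{B} + \langle B\rangle\langle A\rangle,
\]
where we have used the fact that $\hat{A}$ and $\hat{B}$ are linear
operators to bring the constants $\langle A\rangle$ and $\langle B\rangle$ to the front
of each term. Subtracting the above two equations gives
\[
[\hat{A} - \langle A\rangle, \hat{B} - \langle B\rangle]
= \hat{A}\hat{B} - \hat{B}\hat{A} = [\hat{A}, \hat{B}].
\]
Finally, substituting this result into Equation 1.64, we
obtain
\[
\Delta A\, \Delta B \geq \tfrac{1}{2} \big| \big\langle [\hat{A}, \hat{B}] \big\rangle \big|,
\]
which is the generalized uncertainty principle, valid for
any observables A and B represented by the linear
Hermitian operators $\hat{A}$ and $\hat{B}$.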


Ex 2.1 In uniform circular motion, the vectors r and
p are perpendicular to one another, so the magnitude of
the angular momentum is $L = rp\sin(\pi/2) = mvr$.
The direction of the angular momentum is
perpendicular to both r and p, and so is perpendicular
to the horizontal plane of motion. Using the right-hand
rule, the direction of L is vertically downwards.


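Ex 2.2 Expanding the determinant gives
\[
\mathbf{L} = \mathbf{e}_x \begin{vmatrix} y & z \\ p_y & p_z \end{vmatrix}
- \mathbf{e}_y \begin{vmatrix} x & z \\ p_x & p_z \end{vmatrix}
+ \mathbf{e}_z \begin{vmatrix} x & y \\ p_x & p_y \end{vmatrix},
\]
so the y-component is
\[
L_y = -\begin{vmatrix} x & z \\ p_x & p_z \end{vmatrix}
= -(x p_z - z p_x) = z p_x - x p_z,
\]
as expected.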


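Ex 2.3 Provided that $n \neq m$, we have
\[
\int_0^{2\pi} \psi_m^*(\phi)\, \psi_n(\phi)\, d\phi
= \frac{1}{2\pi} \int_0^{2\pi} e^{-im\phi} e^{in\phi}\, d\phi
= \frac{1}{2\pi} \int_0^{2\pi} e^{i(n-m)\phi}\, d\phi
\]
\[
= \frac{1}{2\pi} \left[ \frac{e^{i(n-m)\phi}}{i(n-m)} \right]_0^{2\pi} = 0,
\]
as required (since (n - m) is an integer).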


Ex 2.4 If the energies are proportional to $l(l + 1)$,
then the second excited state will be at energy
\[
\frac{2(2 + 1)}{1(1 + 1)} \times 0.00256\ \text{eV} = 3 \times 0.00256\ \text{eV}
= 0.00768\ \text{eV},
\]
and the next state will be at energy
\[
\frac{3(3 + 1)}{1(1 + 1)} \times 0.00256\ \text{eV} = 0.01536\ \text{eV}.
\]
These values agree with the energies shown in the
figure and are consistent with the claim that the energy
levels are proportional to $l(l + 1)$, where $l = 0, 1, 2, \ldots$.
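The arithmetic in this solution can be reproduced with a few lines of Python. The sketch assumes, as the solution does, that the l = 1 level lies at 0.00256 eV and that energies scale as l(l + 1); the function and constant names are hypothetical.

```python
E1_EV = 0.00256  # energy of the l = 1 level in eV, as quoted in the solution

def level_energy(l, e1=E1_EV):
    """Energy of level l, assuming E is proportional to l(l + 1) and E(l=1) = e1."""
    return l * (l + 1) / (1 * (1 + 1)) * e1

for l in range(4):
    print(l, round(level_energy(l), 5))
# 0 0.0
# 1 0.00256
# 2 0.00768
# 3 0.01536
```

The ratios of successive levels to the l = 1 level are 0, 1, 3 and 6, which is the pattern being checked against the figure.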


Ex 2.5 (a) The operator $\hat{L}_z$ commutes with $V(r, \theta)$
for exactly the same reason that it commutes with
V(r) (see main text); the derivative with respect to $\phi$
does not affect $V(r, \theta)$ at all. It follows that $\langle L_z\rangle$ is
conserved; this is reasonable because the system has
axial symmetry around the z-axis. (However, $\langle L_x\rangle$ and
$\langle L_y\rangle$ will not be conserved in this case.)

(b) The operator $\hat{L}_z$ does not commute with $V(r, \phi)$ in
general. For any function $f(r, \theta, \phi)$, we have
\[
\hat{L}_z V(r, \phi) f(r, \theta, \phi)
= -i\hbar \frac{\partial}{\partial \phi}\big( V(r, \phi) f(r, \theta, \phi) \big)
\]
\[
= -i\hbar V(r, \phi) \frac{\partial f}{\partial \phi}
- i\hbar f(r, \theta, \phi) \frac{\partial V}{\partial \phi}
= V(r, \phi) \hat{L}_z f(r, \theta, \phi)
- i\hbar f(r, \theta, \phi) \frac{\partial V}{\partial \phi}.
\]
This is equal to $V(r, \phi) \hat{L}_z f(r, \theta, \phi)$ only in the trivial
case where $\partial V/\partial \phi = 0$, which corresponds to V being
independent of $\phi$. It follows that $\langle L_z\rangle$ is not conserved.


Ex 2.6 (a) The possible kets are $|3, -3\rangle$, $|3, -2\rangle$,
$|3, -1\rangle$, $|3, 0\rangle$, $|3, 1\rangle$, $|3, 2\rangle$ and $|3, 3\rangle$.

(b) The minimum possible value of m is -4, and
$l(l + 1) = 4(4 + 1) = 20$, so the appropriate eigenvalue
equations are
\[
\hat{L}_z |4, -4\rangle = -4\hbar\, |4, -4\rangle
\]
and
\[
\hat{L}^2 |4, -4\rangle = 20\hbar^2\, |4, -4\rangle.
\]


Ex3.1 The magnitudes of the deflections are the same 
as for the previous case, so the magnitude of the spin 
component must be the same. Hence we interpret the 
result by saying that the only possible outcomes of a 
measurement of Sy are +h/2. 


Ex 3.2 (a) In Figure 3.3, $\theta = 0^\circ$ so $\cos^2(\theta/2) = 1$,
in agreement with the fact that only an upper
component is found by the analyzer.

In Figure 3.4, $\theta = 90^\circ$ so
\[
\cos^2(\theta/2) = \cos^2(45^\circ) = (1/\sqrt{2})^2 = 1/2,
\]
in agreement with the fact that two components of equal
intensity are found.

(b) Rotating the analyzer of Figure 3.3 by $180^\circ$ gives
$\theta = 180^\circ$ so
\[
\cos^2(\theta/2) = \cos^2(90^\circ) = 0,
\]
showing that none of the beam is deflected towards the
north pole of the inverted analyzer; all of the beam is
deflected towards the south pole.

(c) Rotating the analyzer of Figure 3.4 by $180^\circ$ gives
$\theta = 270^\circ$ so
\[
\cos^2(\theta/2) = \cos^2(135^\circ) = (-1/\sqrt{2})^2 = 1/2,
\]
showing that equal intensities are again observed.


Ex 3.3 The angle between successive orientations is
$\tfrac{1}{n}\, 90^\circ$. Applying the $\cos^2(\theta/2)$ rule n times, we see that
a fraction
\[
f(n) = \left[ \cos^2\!\left( \tfrac{1}{n}\, 90^\circ / 2 \right) \right]^n
= \cos^{2n}(90^\circ / 2n)
\]
of the beam prepared by P4 is deflected along the 
orientation vector of A. Evaluating this result for n = 1,
n = 2 and n = 3 gives f(1) = 0.50, f(2) = 0.73 and
f(3) = 0.81.
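These values are easy to reproduce. The following Python sketch evaluates $f(n) = \cos^{2n}(90^\circ/2n)$; the function name `fraction_deflected` is a hypothetical choice made for this example.

```python
import math

def fraction_deflected(n):
    """f(n) = cos^(2n)(90 degrees / 2n): the cos^2(theta/2) rule applied n times."""
    step = math.radians(90.0 / n)      # angle between successive orientations
    return math.cos(step / 2) ** (2 * n)

for n in (1, 2, 3):
    print(n, round(fraction_deflected(n), 2))
# 1 0.5
# 2 0.73
# 3 0.81
```

Evaluating the same function for larger n shows the fraction creeping towards 1 as the 90 degree turn is divided into more and more small steps.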


Ex3.4 (a) No definite prediction can be made 

for a single atom in the given state, except that a 
measurement of S, will give either +h/2 or —h/2. The 
probability of getting $+\hbar/2$ is $(\sqrt{3}/2)^2 = 3/4$, and the
probability of getting $-\hbar/2$ is $(1/2)^2 = 1/4$, so the
value $+\hbar/2$ is more likely, but the value $-\hbar/2$ would
not be at all surprising. 


(b) For a million atoms, we expect that close to 750 000 


atoms will give S, = +h/2, and the remainder will give 
S, = —h/2. 
Ex 3.5 Using Equation 3.7, we have
\[
\langle A|B\rangle^* = (a_1^* b_1 + a_2^* b_2)^*
= a_1 b_1^* + a_2 b_2^* = b_1^* a_1 + b_2^* a_2 = \langle B|A\rangle.
\]
This result illustrates the fact that the inner product in
spin space obeys similar rules to the inner product in
function space.


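Ex 3.6 Any vector $|a\rangle$ can be multiplied by a constant
$\lambda$ to obtain $|A\rangle = \lambda|a\rangle$. To ensure that $|A\rangle$ is
normalized, we require that
\[
1 = \langle A|A\rangle = |\lambda|^2 \langle a|a\rangle,
\]
and this can be achieved by taking $\lambda = 1/\sqrt{\langle a|a\rangle}$.

(a) For $|a\rangle = |{\uparrow_z}\rangle + |{\downarrow_z}\rangle$, the coefficients are
$a_1 = 1$ and $a_2 = 1$, so $\langle a|a\rangle = |a_1|^2 + |a_2|^2 = 2$.
A suitable normalization factor is $\lambda = 1/\sqrt{2}$,
and the corresponding normalized vector is
$|A\rangle = \big(|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\big)/\sqrt{2}$.

(b) For $|a\rangle = |{\uparrow_z}\rangle + i|{\downarrow_z}\rangle$, the coefficients are
$a_1 = 1$ and $a_2 = i$, so we have $\langle a|a\rangle = |a_1|^2 + |a_2|^2 = 2$,
and $\lambda = 1/\sqrt{2}$ is again suitable. The corresponding
normalized vector is
$|A\rangle = \big(|{\uparrow_z}\rangle + i|{\downarrow_z}\rangle\big)/\sqrt{2}$.

(c) For $|a\rangle = 5|{\uparrow_z}\rangle - 12|{\downarrow_z}\rangle$, the coefficients are
$a_1 = 5$ and $a_2 = -12$, so $\langle a|a\rangle = |a_1|^2 + |a_2|^2 = 169$.
We can therefore take $\lambda = 1/13$, and the corresponding
normalized vector is
$|A\rangle = \big(5|{\uparrow_z}\rangle - 12|{\downarrow_z}\rangle\big)/13$.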
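The normalization recipe used in all three parts can be sketched in Python. The fragment assumes a spin vector is stored as a list of its two coefficients in the spin-up and spin-down basis; the function name `normalize` is hypothetical.

```python
import math

def normalize(a):
    """Multiply |a> by 1/sqrt(<a|a>) so that the result has unit norm."""
    norm_sq = sum(abs(c) ** 2 for c in a)   # <a|a> = |a_1|^2 + |a_2|^2
    if norm_sq == 0:
        raise ValueError("the zero vector cannot be normalized")
    lam = 1 / math.sqrt(norm_sq)
    return [lam * c for c in a]

print(normalize([1, 1]))     # both coefficients approximately 0.7071
print(normalize([1, 1j]))    # approximately 0.7071 and 0.7071j
print(normalize([5, -12]))   # approximately 0.3846 and -0.9231, i.e. 5/13 and -12/13
```

Any overall phase factor could be included in the normalization constant as well; taking it real and positive, as here, is simply the most convenient choice.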


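Ex 3.7 If $|C\rangle$ were a multiple of $|A\rangle$, the ratio of the
coefficient of $|{\uparrow_z}\rangle$ to the coefficient of $|{\downarrow_z}\rangle$ would be
the same for both vectors.

In $|C\rangle = \big(i|{\uparrow_z}\rangle + |{\downarrow_z}\rangle\big)/\sqrt{2}$ this ratio has the value
$i/1 = i$, but in $|A\rangle = \big(|{\uparrow_z}\rangle + i|{\downarrow_z}\rangle\big)/\sqrt{2}$ it has a
different value, $1/i = -i$.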


Ex3.8 We have 


wiey = 5 fi}=50+0=1. 
(viv) =5 [41 ji i] =a+n=1 
wiv)=5 0 fy] =5-1+0 = 


The vectors |U} and |V} are normalized and orthogonal 
to one another, so they are orthonormal. 


Ex3.9 The question tells us that 


The coefficient of | Tx) in the expansion for | 1+) is 
1/\/2. The probability of getting Sy = +h/2 is given 
by the square of the modulus of this coefficient, and so 
is equal to 1/2. 


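Ex 3.10 Multiplying out the matrices,
\[
\hat{S}_x \hat{S}_y = \frac{\hbar^2}{4}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}
= \frac{\hbar^2}{4} \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}
\]
and
\[
\hat{S}_y \hat{S}_x = \frac{\hbar^2}{4}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
= \frac{\hbar^2}{4} \begin{bmatrix} -i & 0 \\ 0 & i \end{bmatrix}.
\]
Subtracting these two results, we obtain
\[
\hat{S}_x \hat{S}_y - \hat{S}_y \hat{S}_x
= \frac{\hbar^2}{2} \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}
= i\hbar\, \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
= i\hbar \hat{S}_z,
\]
as required.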
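The same commutator can be verified numerically. This Python sketch works in units where $\hbar = 1$ (an assumption made here for convenience) and uses hypothetical helper names; the spin matrices are $\hbar/2$ times the standard 2 x 2 matrices appearing in the products above.

```python
HBAR = 1.0  # units with hbar = 1, chosen for this sketch

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def scale(c, M):
    """Multiply every element of M by the scalar c."""
    return [[c * x for x in row] for row in M]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def close(A, B, tol=1e-12):
    """Element-by-element comparison within a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

Sx = scale(HBAR / 2, [[0, 1], [1, 0]])
Sy = scale(HBAR / 2, [[0, -1j], [1j, 0]])
Sz = scale(HBAR / 2, [[1, 0], [0, -1]])

print(close(commutator(Sx, Sy), scale(1j * HBAR, Sz)))  # True: [Sx, Sy] = i hbar Sz
print(close(commutator(Sy, Sz), scale(1j * HBAR, Sx)))  # True: cyclic partner
print(close(commutator(Sz, Sx), scale(1j * HBAR, Sy)))  # True: cyclic partner
```

The last two lines check the cyclic permutations of the relation, which follow from the same kind of matrix multiplication.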


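Ex 3.11 Operating with $\hat{S}_y$ on the two vectors given
in the question, we obtain
\[
\hat{S}_y \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}
= \frac{\hbar}{2} \frac{1}{\sqrt{2}}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \begin{bmatrix} 1 \\ i \end{bmatrix}
= \frac{\hbar}{2} \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}
\]
and
\[
\hat{S}_y \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}
= \frac{\hbar}{2} \frac{1}{\sqrt{2}}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \begin{bmatrix} i \\ 1 \end{bmatrix}
= -\frac{\hbar}{2} \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}.
\]
So these vectors are eigenvectors of $\hat{S}_y$, with
eigenvalues $+\hbar/2$ and $-\hbar/2$, respectively.

The physical interpretation is that, when a measurement
of $S_y$ is made,
$\frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}$ describes a state that is
certain to give the value $+\hbar/2$, and
$\frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}$ describes a state
that is certain to give the value $-\hbar/2$.

Comment: Exercise 8.12 in the Mathematical toolkit
shows that these are the only eigenvectors and
eigenvalues of $\hat{S}_y$.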


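Ex 3.12 Taking the inner products of vectors using
spinor notation, we obtain
\[
\langle {\uparrow_n}|{\uparrow_n}\rangle
= \begin{bmatrix} \cos(\theta/2) & e^{-i\phi}\sin(\theta/2) \end{bmatrix}
\begin{bmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{bmatrix}
= \cos^2(\theta/2) + \sin^2(\theta/2) = 1,
\]
\[
\langle {\downarrow_n}|{\downarrow_n}\rangle
= \begin{bmatrix} -e^{i\phi}\sin(\theta/2) & \cos(\theta/2) \end{bmatrix}
\begin{bmatrix} -e^{-i\phi}\sin(\theta/2) \\ \cos(\theta/2) \end{bmatrix}
= \sin^2(\theta/2) + \cos^2(\theta/2) = 1,
\]
\[
\langle {\uparrow_n}|{\downarrow_n}\rangle
= \begin{bmatrix} \cos(\theta/2) & e^{-i\phi}\sin(\theta/2) \end{bmatrix}
\begin{bmatrix} -e^{-i\phi}\sin(\theta/2) \\ \cos(\theta/2) \end{bmatrix}
\]
\[
= -e^{-i\phi}\cos(\theta/2)\sin(\theta/2) + e^{-i\phi}\sin(\theta/2)\cos(\theta/2) = 0.
\]
This shows that the eigenvectors are normalized and
mutually orthogonal.

Comment: The orthogonality of these eigenfunctions
also follows from general quantum-mechanical
principles. Because it represents an observable, the
general spin matrix behaves as a Hermitian operator.
This implies that different eigenvectors, corresponding
to different eigenvalues, are orthogonal.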


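Ex 3.13 The first step in any exercise like this is to
write down what $\theta$ and $\phi$ are.

(a) We require $|{\uparrow_n}\rangle$ with $\theta = 90^\circ$ and $\phi = 0$. This
gives
\[
|{\uparrow_x}\rangle = \begin{bmatrix} \cos 45^\circ \\ \sin 45^\circ \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
(b) We require $|{\downarrow_n}\rangle$ with $\theta = 90^\circ$ and $\phi = 90^\circ$. Noting
that $e^{-i\pi/2} = -i$, we have
\[
|{\downarrow_y}\rangle = \begin{bmatrix} i\sin 45^\circ \\ \cos 45^\circ \end{bmatrix}
= \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}.
\]
(c) We require $|{\uparrow_n}\rangle$ with $\theta = 60^\circ$ and $\phi = 0$. This
gives
\[
|{\uparrow_n}\rangle = \begin{bmatrix} \cos 30^\circ \\ \sin 30^\circ \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} \sqrt{3} \\ 1 \end{bmatrix}.
\]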
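Ex 3.14 Using Equations 3.22 and 3.24, we have
\[
\hat{S}_n |{\uparrow_n}\rangle = \frac{\hbar}{2}
\begin{bmatrix} \cos\theta & e^{-i\phi}\sin\theta \\ e^{i\phi}\sin\theta & -\cos\theta \end{bmatrix}
\begin{bmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{bmatrix}.
\]
Multiplying out the matrices gives
\[
\hat{S}_n |{\uparrow_n}\rangle = \frac{\hbar}{2}
\begin{bmatrix}
\cos\theta\cos(\theta/2) + \sin\theta\sin(\theta/2) \\
e^{i\phi}\big(\sin\theta\cos(\theta/2) - \cos\theta\sin(\theta/2)\big)
\end{bmatrix}.
\]
The standard trigonometric identities
\[
\cos(A - B) = \cos A\cos B + \sin A\sin B,
\]
\[
\sin(A - B) = \sin A\cos B - \cos A\sin B
\]
can then be used to obtain
\[
\hat{S}_n |{\uparrow_n}\rangle = \frac{\hbar}{2}
\begin{bmatrix} \cos(\theta/2) \\ e^{i\phi}\sin(\theta/2) \end{bmatrix}
= \frac{\hbar}{2} |{\uparrow_n}\rangle,
\]
as required.
Comment: A similar calculation would show that
$|{\downarrow_n}\rangle$ in Equations 3.24 is an eigenvector of $\hat{S}_n$ with
eigenvalue $-\hbar/2$.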


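Ex 3.15 The eigenvectors $|{\uparrow_y}\rangle$ and $|{\downarrow_y}\rangle$ are obtained
by setting $\theta = \pi/2$ and $\phi = \pi/2$ in Equations 3.24.
This gives
\[
|{\uparrow_y}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}
\quad\text{and}\quad
|{\downarrow_y}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}.
\]
These vectors provide an orthonormal basis in spin
space, so we can write
\[
|{\uparrow_z}\rangle = a_u |{\uparrow_y}\rangle + a_d |{\downarrow_y}\rangle,
\]
where
\[
a_u = \langle {\uparrow_y}|{\uparrow_z}\rangle
= \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -i \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \frac{1}{\sqrt{2}},
\qquad
a_d = \langle {\downarrow_y}|{\uparrow_z}\rangle
= \frac{1}{\sqrt{2}} \begin{bmatrix} -i & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= -\frac{i}{\sqrt{2}}.
\]
Hence
\[
|{\uparrow_z}\rangle = \frac{1}{\sqrt{2}} |{\uparrow_y}\rangle - \frac{i}{\sqrt{2}} |{\downarrow_y}\rangle.
\]
When $S_y$ is measured, the probability of getting $+\hbar/2$
is
\[
|a_u|^2 = \left| \frac{1}{\sqrt{2}} \right|^2 = \frac{1}{2},
\]
and the probability of getting $-\hbar/2$ is
\[
|a_d|^2 = \left| \frac{-i}{\sqrt{2}} \right|^2 = \frac{1}{2}.
\]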

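Ex 3.16 The state in question is that represented by
$|{\uparrow_z}\rangle$, so Equation 3.34 gives
\[
\langle S_y\rangle = \langle {\uparrow_z}|\hat{S}_y|{\uparrow_z}\rangle
= \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ i \end{bmatrix}
= \frac{\hbar}{2}(0 + 0) = 0,
\]
thus the required expectation value is zero.

An alternative method is based on Equation 3.32.
Repeating the calculation in Exercise 3.15, we see that
the spin-up probability $p_u$ is equal to 1/2, and the
spin-down probability $p_d$ is also equal to 1/2. Hence
\[
\langle S_y\rangle = p_u \left( +\frac{\hbar}{2} \right) + p_d \left( -\frac{\hbar}{2} \right)
= \frac{1}{2}\cdot\frac{\hbar}{2} - \frac{1}{2}\cdot\frac{\hbar}{2} = 0.
\]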


Ex3.17 The transition is produced by photons of 
energy Ephot = hf, where f is the frequency required. 
The separation between the two energy levels is 

AE = hw. We equate Ephot to AE to obtain 


pate we 
a m 2r 
4.26 x 10” Hz T7! x 3.00T 
_ 4.26 x 10 f SAOI amaii 
T 


a frequency in the short-wave radio part of the 
electromagnetic spectrum. 


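Ex 3.18 In the time-dependent spin state obtained in
Worked Example 3.4,
\[
\langle S_y\rangle = \frac{\hbar}{2}
\begin{bmatrix} \cos(\omega t/2) & i\sin(\omega t/2) \end{bmatrix}
\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}
\begin{bmatrix} \cos(\omega t/2) \\ -i\sin(\omega t/2) \end{bmatrix}
\]
\[
= \frac{\hbar}{2}
\begin{bmatrix} \cos(\omega t/2) & i\sin(\omega t/2) \end{bmatrix}
\begin{bmatrix} -\sin(\omega t/2) \\ i\cos(\omega t/2) \end{bmatrix}
= \frac{\hbar}{2} \big( -2\cos(\omega t/2)\sin(\omega t/2) \big)
= -\frac{\hbar}{2} \sin(\omega t).
\]
Comment: Combining this result with that obtained in
Worked Example 3.4, we see that the spin components
$S_z$ and $S_y$ that are perpendicular to the magnetic field
have expectation values that vary sinusoidally in
time and are $\pi/2$ out of phase with one another. The
angular frequency of oscillation is equal to the Larmor
frequency. This phenomenon is called spin precession;
in an MRI scanner, it produces the signals that are used
to distinguish one type of tissue from another.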


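Ex 3.19 The eigenvectors of the Hamiltonian matrix
are the eigenvectors of $\hat{S}_y$. Using Equations 3.24 with
$\theta = \pi/2$ and $\phi = \pi/2$, these are
\[
|{\uparrow_y}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ i \end{bmatrix}
\quad\text{and}\quad
|{\downarrow_y}\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}.
\]
Because the particle has $\gamma_s > 0$, the corresponding
energy eigenvalues are $E_u = -\hbar\omega/2$ and $E_d = +\hbar\omega/2$.

Using the results of Exercise 3.15, the initial state can
be written as
\[
|A\rangle_{\text{initial}} = |{\uparrow_z}\rangle
= \frac{1}{\sqrt{2}} |{\uparrow_y}\rangle - \frac{i}{\sqrt{2}} |{\downarrow_y}\rangle,
\]
so the required time-dependent spinor is
\[
|A\rangle = \frac{1}{\sqrt{2}} \left( e^{+i\omega t/2} |{\uparrow_y}\rangle
- i e^{-i\omega t/2} |{\downarrow_y}\rangle \right)
= \frac{e^{+i\omega t/2}}{2} \begin{bmatrix} 1 \\ i \end{bmatrix}
- \frac{i e^{-i\omega t/2}}{2} \begin{bmatrix} i \\ 1 \end{bmatrix}
\]
\[
= \frac{1}{2} \begin{bmatrix} e^{+i\omega t/2} + e^{-i\omega t/2} \\
i\big(e^{+i\omega t/2} - e^{-i\omega t/2}\big) \end{bmatrix}
= \begin{bmatrix} \cos(\omega t/2) \\ -\sin(\omega t/2) \end{bmatrix}.
\]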
Ex 4.1 For simplicity we denote the x-components
of the momenta of the two particles by $p_1$ and $p_2$.
The kinetic energies of the particles are $p_1^2/2m_1$ and
$p_2^2/2m_2$, and the potential energies are those given:
$\tfrac{1}{2}C_1x_1^2$ and $\tfrac{1}{2}C_2x_2^2$. The classical Hamiltonian function
of the system is then

$$H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + \tfrac{1}{2}C_1x_1^2 + \tfrac{1}{2}C_2x_2^2.$$

Hence, recalling that the momentum operator has the
form $\hat{p}_x = -i\hbar\,\partial/\partial x$, the Hamiltonian operator for the
system is

$$\hat{H} = -\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + \tfrac{1}{2}C_1x_1^2 + \tfrac{1}{2}C_2x_2^2.$$


Solutions to exercises 


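The Schrödinger equation is always of the form

$$i\hbar\,\frac{\partial \Psi}{\partial t} = \hat{H}\,\Psi,$$

which in this case is

$$i\hbar\,\frac{\partial \Psi(x_1, x_2, t)}{\partial t} = \left[-\frac{\hbar^2}{2m_1}\frac{\partial^2}{\partial x_1^2} - \frac{\hbar^2}{2m_2}\frac{\partial^2}{\partial x_2^2} + \tfrac{1}{2}C_1x_1^2 + \tfrac{1}{2}C_2x_2^2\right]\Psi(x_1, x_2, t).$$
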
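Ex4.2 (a) The expectation value $\langle p_1\rangle$ is obtained by
inserting the operator $-i\hbar\,\partial/\partial x_1$ representing $p_1$ into
the sandwich integral:

$$\langle p_1\rangle = -i\hbar\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^*(x_1, x_2)\,\frac{\partial \psi}{\partial x_1}\,dx_1\,dx_2,$$

where

$$\psi(x_1, x_2) = \left(\frac{1}{\pi a_1 a_2}\right)^{1/2} e^{-x_1^2/2a_1^2}\, e^{-x_2^2/2a_2^2}.$$

Differentiating and then separating the integrals, we
obtain

$$\langle p_1\rangle = \frac{i\hbar}{\pi a_1^3 a_2}\int_{-\infty}^{\infty} x_1\, e^{-x_1^2/a_1^2}\,dx_1 \int_{-\infty}^{\infty} e^{-x_2^2/a_2^2}\,dx_2,$$

which is equal to 0 because the integrand in the first
integral is an odd function, and the range of integration
is symmetrical about $x_1 = 0$.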


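(b) The expectation value of $(x_1 - x_2)^2$ is

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\psi^*(x_1, x_2)\,(x_1 - x_2)^2\,\psi(x_1, x_2)\,dx_1\,dx_2
= \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}(x_1^2 - 2x_1x_2 + x_2^2)\, e^{-x_1^2/a_1^2}\, e^{-x_2^2/a_2^2}\,dx_1\,dx_2.$$

The integral is a sum of three terms corresponding to
the three terms in $x_1^2 - 2x_1x_2 + x_2^2$. Let us consider the
integral involving $x_1^2$:

$$I_1 = \frac{1}{\pi a_1 a_2}\int_{-\infty}^{\infty} x_1^2\, e^{-x_1^2/a_1^2}\,dx_1 \int_{-\infty}^{\infty} e^{-x_2^2/a_2^2}\,dx_2.$$

Using standard integrals inside the back cover, the first
integral is $\tfrac{1}{2}a_1^3\sqrt{\pi}$ and the second is $a_2\sqrt{\pi}$, so

$$I_1 = \frac{1}{\pi a_1 a_2} \times \tfrac{1}{2}a_1^3\sqrt{\pi} \times a_2\sqrt{\pi} = \tfrac{1}{2}a_1^2.$$

The integral involving $x_1x_2$ is a product of two integrals
that are both integrals from $-\infty$ to $\infty$ of odd functions
and are therefore equal to zero. Finally, the integral
involving $x_2^2$ is equal to $\tfrac{1}{2}a_2^2$ by an argument similar to
that given above for $I_1$. It follows that

$$\langle (x_1 - x_2)^2\rangle = \tfrac{1}{2}a_1^2 + \tfrac{1}{2}a_2^2.$$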

This expectation value is the quantum-mechanical 


prediction for the average value of the square of the 
separation of the particles. 
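As a numerical cross-check of Ex4.2(b), the following sketch integrates the Gaussian probability density with a simple trapezium rule. It assumes the normalized product wave function used in the solution; the helper names `integrate` and `mean_square_separation`, the integration range of ten widths, and the test values of a1 and a2 are illustrative choices.

```python
import math


def integrate(f, lo, hi, n=20000):
    """Trapezium-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def mean_square_separation(a1, a2):
    """Estimate <(x1 - x2)^2> for the product of two Gaussians of widths a1, a2."""
    norm = 1.0 / (math.pi * a1 * a2)
    g1 = lambda x: math.exp(-x * x / a1**2)
    g2 = lambda x: math.exp(-x * x / a2**2)
    i0_1 = integrate(g1, -10 * a1, 10 * a1)
    i0_2 = integrate(g2, -10 * a2, 10 * a2)
    i2_1 = integrate(lambda x: x * x * g1(x), -10 * a1, 10 * a1)
    i2_2 = integrate(lambda x: x * x * g2(x), -10 * a2, 10 * a2)
    # The cross term -2 x1 x2 integrates to zero because each factor is odd.
    return norm * (i2_1 * i0_2 + i0_1 * i2_2)


print(mean_square_separation(1.0, 2.0), 0.5 * 1.0**2 + 0.5 * 2.0**2)
```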


Ex4.3 (a) The single-particle eigenfunctions in 
Equations 4.18 and 4.19 are real, so we simply take the 
product of their squares: 


$$\text{probability density} = \frac{4}{L^2}\,\sin^2\!\left(\frac{\pi x_1}{L}\right)\sin^2\!\left(\frac{2\pi x_2}{L}\right).$$

(b) At $(x_1, x_2) = (L/2, L/4)$, the arguments of both
sine functions in the probability density are equal to
$\pi/2$, so the sine functions reach their maximum values,
and so does the probability density.

At $(x_1, x_2) = (L/2, 3L/4)$, one sine function is equal
to $+1$ and the other is equal to $-1$. Since both sine
functions are squared, this also corresponds to a
maximum in the probability density.


Ex4.4 (a) The spin component of particle 1 in
the z-direction is $-\tfrac{1}{2}\hbar$, so $m_{s_1} = -\tfrac{1}{2}$; likewise, for
particle 2, $m_{s_2} = +\tfrac{1}{2}$, so the spin state can be
represented by the ket $|{-\tfrac{1}{2}}, +\tfrac{1}{2}\rangle$.

(b) Following the same principles, the spin state can be
represented by the ket $|{-\tfrac{1}{2}}, -\tfrac{1}{2}\rangle$.


Ex4.5 To see whether a given ket is an eigenvector of 
an operator, we let the operator act on the ket and see if 
we get the same ket back, multiplied by a constant. 


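(a) We have

$$\begin{aligned}
\hat{S}_{z1}\,|{\uparrow\uparrow}\rangle &= \bigl(\hat{S}_{z1}\,|{\uparrow}\rangle_1\bigr)\,|{\uparrow}\rangle_2 = \bigl(\tfrac{1}{2}\hbar\,|{\uparrow}\rangle_1\bigr)\,|{\uparrow}\rangle_2 = \tfrac{1}{2}\hbar\,|{\uparrow\uparrow}\rangle,\\
\hat{S}_{z2}\,|{\uparrow\uparrow}\rangle &= |{\uparrow}\rangle_1\,\bigl(\hat{S}_{z2}\,|{\uparrow}\rangle_2\bigr) = |{\uparrow}\rangle_1\,\bigl(\tfrac{1}{2}\hbar\,|{\uparrow}\rangle_2\bigr) = \tfrac{1}{2}\hbar\,|{\uparrow\uparrow}\rangle.
\end{aligned}$$

Adding these equations together gives

$$(\hat{S}_{z1} + \hat{S}_{z2})\,|{\uparrow\uparrow}\rangle = \hbar\,|{\uparrow\uparrow}\rangle,$$

so $|{\uparrow\uparrow}\rangle$ is an eigenvector of $\hat{S}_{z1} + \hat{S}_{z2}$, with
eigenvalue $\hbar$.

(b) Similar calculations give

$$\begin{aligned}
\hat{S}_{z1}\,|{\uparrow\downarrow}\rangle &= +\tfrac{1}{2}\hbar\,|{\uparrow\downarrow}\rangle, &
\hat{S}_{z1}\,|{\downarrow\uparrow}\rangle &= -\tfrac{1}{2}\hbar\,|{\downarrow\uparrow}\rangle,\\
\hat{S}_{z2}\,|{\uparrow\downarrow}\rangle &= -\tfrac{1}{2}\hbar\,|{\uparrow\downarrow}\rangle, &
\hat{S}_{z2}\,|{\downarrow\uparrow}\rangle &= +\tfrac{1}{2}\hbar\,|{\downarrow\uparrow}\rangle.
\end{aligned}$$

So

$$\begin{aligned}
(\hat{S}_{z1} + \hat{S}_{z2})\,|{\uparrow\downarrow}\rangle &= \bigl(+\tfrac{1}{2}\hbar - \tfrac{1}{2}\hbar\bigr)\,|{\uparrow\downarrow}\rangle = 0\,|{\uparrow\downarrow}\rangle,\\
(\hat{S}_{z1} + \hat{S}_{z2})\,|{\downarrow\uparrow}\rangle &= \bigl(-\tfrac{1}{2}\hbar + \tfrac{1}{2}\hbar\bigr)\,|{\downarrow\uparrow}\rangle = 0\,|{\downarrow\uparrow}\rangle.
\end{aligned}$$

Hence

$$(\hat{S}_{z1} + \hat{S}_{z2})\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr) = 0\,\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr).$$

Thus $|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle$ is an eigenvector of $\hat{S}_{z1} + \hat{S}_{z2}$, with
eigenvalue 0.

(c) The working follows that of part (b) until the last
line. Then

$$(\hat{S}_{z1} + \hat{S}_{z2})\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr) = 0\,\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr).$$

So $|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle$ is an eigenvector of $\hat{S}_{z1} + \hat{S}_{z2}$, with
eigenvalue 0.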
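The eigenvalue test described in Ex4.5 (act with the operator and see whether the same ket comes back times a constant) can be sketched in Python. This is an illustration only: kets are stored as dictionaries from basis labels (m1, m2) to amplitudes, normalization factors are ignored, eigenvalues are in units of ħ, and the names `apply_total_sz` and `eigenvalue` are assumptions of this sketch.

```python
from fractions import Fraction

UP, DOWN = Fraction(1, 2), Fraction(-1, 2)


def apply_total_sz(state):
    """Act with S_z1 + S_z2 (units of hbar) on a two-spin state."""
    return {basis: (basis[0] + basis[1]) * amp for basis, amp in state.items()}


def eigenvalue(state):
    """Return the eigenvalue if state is an eigenvector of S_z1 + S_z2, else None."""
    out = apply_total_sz(state)
    ratios = {out[b] / state[b] for b in state if state[b] != 0}
    return ratios.pop() if len(ratios) == 1 else None


up_up = {(UP, UP): 1}
symmetric = {(UP, DOWN): 1, (DOWN, UP): 1}
antisymmetric = {(UP, DOWN): 1, (DOWN, UP): -1}
mixed = {(UP, UP): 1, (DOWN, DOWN): 1}  # not an eigenvector

for name, ket in [("up-up", up_up), ("symmetric", symmetric),
                  ("antisymmetric", antisymmetric), ("mixed", mixed)]:
    print(name, eigenvalue(ket))
```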


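Ex4.6 (a) We consider

$$\psi^+(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_1)\,\psi_k(x_2) + \psi_k(x_1)\,\psi_n(x_2)\bigr].$$

Swapping particle labels,

$$\begin{aligned}
\psi^+(x_2, x_1) &= \frac{1}{\sqrt{2}}\bigl[\psi_n(x_2)\,\psi_k(x_1) + \psi_k(x_2)\,\psi_n(x_1)\bigr]\\
&= \frac{1}{\sqrt{2}}\bigl[\psi_k(x_1)\,\psi_n(x_2) + \psi_n(x_1)\,\psi_k(x_2)\bigr]\\
&= \psi^+(x_1, x_2),
\end{aligned}$$

so the function is symmetric.

(b) The normalization integral for $\psi^+(x_1, x_2)$ is

$$\begin{aligned}
I &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} |\psi^+(x_1, x_2)|^2\,dx_1\,dx_2\\
&= \frac{1}{2}\Bigl[\int \psi_n^*(x_1)\psi_n(x_1)\,dx_1 \int \psi_k^*(x_2)\psi_k(x_2)\,dx_2
+ \int \psi_n^*(x_1)\psi_k(x_1)\,dx_1 \int \psi_k^*(x_2)\psi_n(x_2)\,dx_2\\
&\qquad + \int \psi_k^*(x_1)\psi_n(x_1)\,dx_1 \int \psi_n^*(x_2)\psi_k(x_2)\,dx_2
+ \int \psi_k^*(x_1)\psi_k(x_1)\,dx_1 \int \psi_n^*(x_2)\psi_n(x_2)\,dx_2\Bigr],
\end{aligned}$$

where every integral runs from $-\infty$ to $\infty$.

This can be expressed more compactly using Dirac
bracket notation:

$$I = \tfrac{1}{2}\bigl[\langle\psi_n|\psi_n\rangle\langle\psi_k|\psi_k\rangle + \langle\psi_n|\psi_k\rangle\langle\psi_k|\psi_n\rangle + \langle\psi_k|\psi_n\rangle\langle\psi_n|\psi_k\rangle + \langle\psi_k|\psi_k\rangle\langle\psi_n|\psi_n\rangle\bigr].$$

However, the single-particle eigenfunctions are
orthogonal and normalized, so $\langle\psi_i|\psi_j\rangle = \delta_{ij}$, allowing
us to conclude that

$$I = \tfrac{1}{2}\,(1 \times 1 + 0 \times 0 + 0 \times 0 + 1 \times 1) = 1.$$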


The total probability of the two particles being 
somewhere in the whole of space is unity. 


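Ex 4.7 In the antisymmetric function given by
Equation 4.37,

$$\psi^-(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_n(x_2)\bigr].$$

Since $\psi_n(x) = \psi_k(x)$, we can replace $\psi_n$ by $\psi_k$, to
obtain

$$\psi^-(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_k(x_1)\,\psi_k(x_2) - \psi_k(x_1)\,\psi_k(x_2)\bigr] = 0.$$

Thus the antisymmetric eigenfunction with two
identical particles in the same spatial state is equal to
zero everywhere.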


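Ex4.8 (a) Using the single-particle eigenfunctions
given in the question, the symmetric two-particle
eigenfunction is

$$\begin{aligned}
\psi^+(x_1, x_2) &= \frac{1}{\sqrt{2}}\bigl[\psi_0(x_1)\,\psi_1(x_2) + \psi_1(x_1)\,\psi_0(x_2)\bigr]\\
&= \frac{1}{a^2\sqrt{\pi}}\,(x_1 + x_2)\, e^{-x_1^2/2a^2}\, e^{-x_2^2/2a^2}\\
&= \frac{1}{a^2\sqrt{\pi}}\,(x_1 + x_2)\, e^{-(x_1^2 + x_2^2)/2a^2},
\end{aligned}$$

and the antisymmetric two-particle eigenfunction is

$$\begin{aligned}
\psi^-(x_1, x_2) &= \frac{1}{\sqrt{2}}\bigl[\psi_0(x_1)\,\psi_1(x_2) - \psi_1(x_1)\,\psi_0(x_2)\bigr]\\
&= \frac{1}{a^2\sqrt{\pi}}\,(x_2 - x_1)\, e^{-(x_1^2 + x_2^2)/2a^2}.
\end{aligned}$$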


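(b) Because the particles do not interact with one
another, the total Hamiltonian operator describing
the two-particle system can be expressed as a sum
of Hamiltonians associated with each particle:
$\hat{H} = \hat{H}_1 + \hat{H}_2$, where $\hat{H}_1\,\psi_n(x_1) = E_n\,\psi_n(x_1)$ and
$\hat{H}_2\,\psi_n(x_2) = E_n\,\psi_n(x_2)$.

The operator $\hat{H}_1$ does nothing to $\psi_n(x_2)$, so $\hat{H}_1$ acting
on $\psi^\pm(x_1, x_2)$ treats $\psi_n(x_2)$ like a constant. Thus

$$\begin{aligned}
\hat{H}_1\,\psi^\pm(x_1, x_2) &= \frac{1}{\sqrt{2}}\,\hat{H}_1\bigl[\psi_0(x_1)\,\psi_1(x_2) \pm \psi_1(x_1)\,\psi_0(x_2)\bigr]\\
&= \frac{1}{\sqrt{2}}\bigl[(\hat{H}_1\psi_0(x_1))\,\psi_1(x_2) \pm (\hat{H}_1\psi_1(x_1))\,\psi_0(x_2)\bigr]\\
&= \frac{1}{\sqrt{2}}\bigl[E_0\,\psi_0(x_1)\,\psi_1(x_2) \pm E_1\,\psi_1(x_1)\,\psi_0(x_2)\bigr].
\end{aligned}$$

In a similar way,

$$\hat{H}_2\,\psi^\pm(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[E_1\,\psi_0(x_1)\,\psi_1(x_2) \pm E_0\,\psi_1(x_1)\,\psi_0(x_2)\bigr].$$

Adding these two results together, we find that

$$\hat{H}\,\psi^\pm(x_1, x_2) = (E_0 + E_1)\,\psi^\pm(x_1, x_2),$$

so $\psi^+(x_1, x_2)$ and $\psi^-(x_1, x_2)$ are eigenfunctions of $\hat{H}$,
both with eigenvalue $E_0 + E_1 = (\tfrac{1}{2} + \tfrac{3}{2})\hbar\omega_0 = 2\hbar\omega_0$.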


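Ex 4.9 Starting with

$$\frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr) = \frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle_1|{\downarrow}\rangle_2 + |{\downarrow}\rangle_1|{\uparrow}\rangle_2\bigr),$$

we exchange the particle labels and then rearrange. This
gives

$$\begin{aligned}
\frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle_2|{\downarrow}\rangle_1 + |{\downarrow}\rangle_2|{\uparrow}\rangle_1\bigr)
&= \frac{1}{\sqrt{2}}\bigl(|{\downarrow}\rangle_1|{\uparrow}\rangle_2 + |{\uparrow}\rangle_1|{\downarrow}\rangle_2\bigr)\\
&= \frac{1}{\sqrt{2}}\bigl(|{\uparrow}\rangle_1|{\downarrow}\rangle_2 + |{\downarrow}\rangle_1|{\uparrow}\rangle_2\bigr)\\
&= \frac{1}{\sqrt{2}}\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr),
\end{aligned}$$

as required.

For $|{\uparrow\uparrow}\rangle = |{\uparrow}\rangle_1|{\uparrow}\rangle_2$ and $|{\downarrow\downarrow}\rangle = |{\downarrow}\rangle_1|{\downarrow}\rangle_2$, we
simply interchange the particle labels and then re-order
(perfectly legal!) to get the same expressions as we
started with.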


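Ex 4.10 Using a method similar to that in
Exercise 4.5, we have

$$\begin{aligned}
\hat{S}_{z1}\,|{\downarrow\downarrow}\rangle &= \bigl(\hat{S}_{z1}\,|{\downarrow}\rangle_1\bigr)\,|{\downarrow}\rangle_2 = \bigl(-\tfrac{1}{2}\hbar\,|{\downarrow}\rangle_1\bigr)\,|{\downarrow}\rangle_2 = -\tfrac{1}{2}\hbar\,|{\downarrow\downarrow}\rangle,\\
\hat{S}_{z2}\,|{\downarrow\downarrow}\rangle &= |{\downarrow}\rangle_1\,\bigl(\hat{S}_{z2}\,|{\downarrow}\rangle_2\bigr) = |{\downarrow}\rangle_1\,\bigl(-\tfrac{1}{2}\hbar\,|{\downarrow}\rangle_2\bigr) = -\tfrac{1}{2}\hbar\,|{\downarrow\downarrow}\rangle.
\end{aligned}$$

Adding these equations together gives

$$\hat{S}_z\,|{\downarrow\downarrow}\rangle = (\hat{S}_{z1} + \hat{S}_{z2})\,|{\downarrow\downarrow}\rangle = -\hbar\,|{\downarrow\downarrow}\rangle,$$

so $|{\downarrow\downarrow}\rangle$ is an eigenvector of $\hat{S}_z$, with eigenvalue $-\hbar$ and
the quantum number $M_S = -1$.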


Ex 4.11 Identical bosons are described by a
symmetric total wave function. Spinless particles are
counted as having symmetric spin states, so the spatial
wave function must be symmetric too. The only
possibility is

$$\psi^+(x_1, x_2) = \frac{1}{\sqrt{2}}\bigl[\psi_n(x_1)\,\psi_k(x_2) + \psi_k(x_1)\,\psi_n(x_2)\bigr].$$

Ex 4.12 For the first excited level, one particle must
be in the ground state of the harmonic oscillator and the
other one in the first excited state. Using the solution to
Exercise 4.8, the single-particle eigenfunctions can be
combined to form symmetric and antisymmetric
functions:

$$\psi^+(x_1, x_2) = \frac{1}{a^2\sqrt{\pi}}\,(x_1 + x_2)\, e^{-(x_1^2 + x_2^2)/2a^2},$$

$$\psi^-(x_1, x_2) = \frac{1}{a^2\sqrt{\pi}}\,(x_1 - x_2)\, e^{-(x_1^2 + x_2^2)/2a^2},$$

where $a = \sqrt{\hbar/m\omega_0}$ and $\omega_0$ is the classical angular
frequency.

These must be combined with the triplet and singlet
spin functions $|{\uparrow\uparrow}\rangle$, $\tfrac{1}{\sqrt{2}}(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle)$, $|{\downarrow\downarrow}\rangle$ and
$\tfrac{1}{\sqrt{2}}(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle)$.

Since the particles are fermions, their total wave
function must be antisymmetric. We therefore combine
$\psi^-(x_1, x_2)$ with the symmetric triplet spin kets and
$\psi^+(x_1, x_2)$ with the antisymmetric singlet spin ket. The
four possibilities are

$$\frac{1}{a^2\sqrt{\pi}}\,(x_1 - x_2)\, e^{-(x_1^2 + x_2^2)/2a^2}\,|{\uparrow\uparrow}\rangle,$$

$$\frac{1}{a^2\sqrt{2\pi}}\,(x_1 - x_2)\, e^{-(x_1^2 + x_2^2)/2a^2}\,\bigl(|{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle\bigr),$$

$$\frac{1}{a^2\sqrt{\pi}}\,(x_1 - x_2)\, e^{-(x_1^2 + x_2^2)/2a^2}\,|{\downarrow\downarrow}\rangle,$$

$$\frac{1}{a^2\sqrt{2\pi}}\,(x_1 + x_2)\, e^{-(x_1^2 + x_2^2)/2a^2}\,\bigl(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle\bigr).$$

These states all have the same energy,
$\tfrac{1}{2}\hbar\omega_0 + \tfrac{3}{2}\hbar\omega_0 = 2\hbar\omega_0$, corresponding to one particle in
the ground state and the other in the first excited state.
So, in the absence of any interactions between the
particles, the first excited level of two identical fermions
in a simple harmonic oscillator well has four-fold
degeneracy.


Ex 4.13 A hydrogen atom consists of a proton and an 
electron (2 fermions), so is a boson. 


A deuterium atom consists of a proton, a neutron and an 
electron (3 fermions), so is a fermion. 


A singly-ionized helium atom consists of 2 protons, 
2 neutrons and 1 electron (5 fermions), so is a fermion. 


A $^{238}$U nucleus consists of 238 protons plus neutrons
(238 fermions), so is a boson.


A $^{235}$U nucleus consists of 235 protons plus neutrons
(235 fermions), so is a fermion.


A $^{235}$U atom consists of 235 protons plus neutrons and
92 electrons (327 fermions), so is a fermion.
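The counting rule used throughout Ex 4.13 (an even number of constituent fermions gives a boson, an odd number gives a fermion) can be written as a short Python sketch. The function name `classify` and the dictionary of systems are illustrative; the fermion counts are those quoted in the solution.

```python
def classify(n_fermions):
    """Return 'boson' for an even number of constituent fermions, else 'fermion'."""
    return "boson" if n_fermions % 2 == 0 else "fermion"


# Fermion counts quoted in the solution to Ex 4.13.
systems = {
    "hydrogen atom": 2,
    "deuterium atom": 3,
    "singly-ionized helium atom": 5,
    "U-238 nucleus": 238,
    "U-235 nucleus": 235,
    "U-235 atom": 235 + 92,
}

for name, count in systems.items():
    print(f"{name}: {count} fermions, so a {classify(count)}")
```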


Ex 4.14 Protons are spin-$\tfrac{1}{2}$ fermions, so they are
described by an antisymmetric total wave function. The
protons are in a spatially symmetric state, so they must
be in an antisymmetric spin state, that is, the singlet
state represented by $|0, 0\rangle = \tfrac{1}{\sqrt{2}}(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle)$. The
two-proton system therefore has $S = 0$ and $M_S = 0$.


Ex 4.15 No, it is incorrect. The Pauli exclusion 
principle states that there can only be two electrons in 
the same quantum state. But quantum states can be 
degenerate: different states may have the same energy. 
This is the case in a three-dimensional box, where 
different combinations of the quantum numbers $n_x$, $n_y$
and $n_z$ can give the same energy.


Ex5.1 The wave functions $\Psi_1$ and $\Psi_3$ represent the
same state, because $\Psi_3 = i\Psi_1$, and $i = e^{i\pi/2}$ is a phase
factor (a complex number of unit modulus). The wave
function $\Psi_2$ represents an entirely different state,
because it cannot be expressed as a multiple of $\Psi_1$
or $\Psi_3$.


Ex5.2 A silver atom is a boson. It has 107 or 109 
nucleons (protons or neutrons, all fermions) and 
47 electrons, an even number of fermions in all. 


Comment: In Chapters 2 and 3 of this book, a
silver atom was used as a prime example of a spin-$\tfrac{1}{2}$
particle. Given that spin-$\tfrac{1}{2}$ particles are fermions,
you might wonder what is going on! The answer
is that the Stern–Gerlach experiment measures the
magnetic dipole moment, and this is dominated by the
contribution of the electrons (the nuclear contribution is
roughly a thousand times smaller). The deflection of a
silver atom depends only on the magnetic moment of
the electrons; many contributions cancel out, and the
magnetic moment is essentially determined by the spin
of a single electron in the atom, which is a spin-$\tfrac{1}{2}$ particle.


Ex5.3 Using rules given in Chapter 1, we see that $\hat{y}\hat{p}_z$
is Hermitian because it is the product of two commuting
Hermitian operators. For the same reason, $\hat{z}\hat{p}_y$ is
Hermitian, and so is $-\hat{z}\hat{p}_y$, being the product of a
Hermitian operator and a real constant ($-1$). Finally,
$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y$ is Hermitian because it is the sum of
two Hermitian operators. This is to be expected,
because $L_x$ is an observable quantity, and so must be
represented by a linear Hermitian operator.


Ex5.4 The probability is 
2 


RER 
= 41-1 


= #(1-i)(1+i) =5-. 


Keltai 


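Ex5.5 Multiplying out the matrices gives

$$\langle S_z\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & -i\end{bmatrix}\,\frac{\hbar}{2}\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix}\,\frac{1}{\sqrt{2}}\begin{bmatrix}1\\ i\end{bmatrix} = \frac{\hbar}{4}\,(1 - 1) = 0.$$

If $p_1$ is the probability of getting $S_z = +\hbar/2$, and $p_2$ is
the probability of getting $-\hbar/2$, the expectation value
of $S_z$ is

$$\langle S_z\rangle = p_1 \times \left(+\frac{\hbar}{2}\right) + p_2 \times \left(-\frac{\hbar}{2}\right) = \frac{\hbar}{2}\,(p_1 - p_2).$$

We have found $\langle S_z\rangle = 0$ in the state represented by
$|{\uparrow_y}\rangle$, so it follows that $p_1 = p_2$. The normalization rule
for probability, $p_1 + p_2 = 1$, then gives $p_1 = p_2 = 1/2$.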


Ex5.6 The position measurement causes the initial
ground-state wave function to collapse into a narrow
wave packet that is strongly peaked around the
measured position $x_0$. This wave packet is a linear
superposition of different harmonic oscillator energy
eigenfunctions. Evidently, this linear superposition
includes the tenth excited state, so when the energy of
the particle is measured, there is a non-zero probability
of getting the corresponding energy, $E_{10}$. When this
energy value is obtained, the wave function collapses
into the corresponding energy eigenfunction, $\psi_{10}(x)$.
This eigenfunction evolves as the stationary state
$\Psi_{10}(x, t) = \psi_{10}(x)\, e^{-iE_{10}t/\hbar}$, so a measurement of
energy taken at a later time still gives the value $E_{10}$.


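Ex5.7 The state $|\Psi\rangle$ must be normalized, so

$$1 = \langle\Psi|\Psi\rangle = \bigl(c_1^*\langle\Psi_1| + c_2^*\langle\Psi_2|\bigr)\bigl(c_1|\Psi_1\rangle + c_2|\Psi_2\rangle\bigr).$$

Multiplying out the brackets and using the
orthonormality of $|\Psi_1\rangle$ and $|\Psi_2\rangle$ then gives

$$1 = c_1^*c_1\,\langle\Psi_1|\Psi_1\rangle + c_2^*c_2\,\langle\Psi_2|\Psi_2\rangle = |c_1|^2 + |c_2|^2.$$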


Ex 5.8 (a) The probability of getting the eigenvalue
$b_j$ is

$$\text{probability} = |\langle b_j|b_i\rangle|^2 = |\delta_{ji}|^2 = \delta_{ji},$$

so the probability is equal to 1 if $b_j = b_i$, and is equal
to 0 otherwise.

(b) The probability of getting the eigenvalue $a_j$ is

$$\text{probability} = |\langle a_j|b_i\rangle|^2.$$

The $\cos^2(\theta/2)$ rule was an exemplar of this situation in
Chapter 3, with $S_z$ playing the role of $B$, and $S_n$ (with $\mathbf{n}$
corresponding to the polar angle $\theta$) playing the role
of $A$.


Ex6.1 The first measurement will place the second
particle in the state $|{\downarrow_n}\rangle$, which, from Equation 6.9, is

$$-\sin 30^\circ\,|{\uparrow}\rangle + \cos 30^\circ\,|{\downarrow}\rangle.$$

The probability amplitude for the particle to be found in
the $|{\uparrow}\rangle$ state is

$$\langle{\uparrow}|{\downarrow_n}\rangle = -\sin 30^\circ\,\langle{\uparrow}|{\uparrow}\rangle + \cos 30^\circ\,\langle{\uparrow}|{\downarrow}\rangle = -\sin 30^\circ = -\tfrac{1}{2},$$

using the orthonormality of the spin kets $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$.
The required probability is the modulus squared of the
probability amplitude, $|{-\tfrac{1}{2}}|^2 = 0.25$.


Comment: It would be legitimate simply to spot that
the amplitude for the $|{\uparrow}\rangle$ term is $-\sin 30^\circ$, and square
to obtain the probability.


Ex 6.2 From the expression for a spin state in the
$-\mathbf{n}$-direction, Equation 6.9 with $\phi = 0$, we have

$$\langle{\downarrow_n}| = -\sin(\theta/2)\,\langle{\uparrow}| + \cos(\theta/2)\,\langle{\downarrow}|.$$

Using the orthonormality of $|{\uparrow}\rangle$ and $|{\downarrow}\rangle$, we obtain the
probability amplitude $\langle{\downarrow_n}|{\uparrow}\rangle = -\sin(\theta/2)$. Taking
the modulus and squaring, to get the probability, we
obtain the desired equation.


Comment: This result also follows from
$\sin^2(\theta/2) + \cos^2(\theta/2) = 1$ and the fact that the total
probability is 1.


Ex 6.3 First, the ket: the two particles are in a
singlet state, with $S = 0$ and $M_S = 0$; hence
$\langle\, {*}\ {*}\, |S = 0, M_S = 0\rangle$. Secondly, particle 1 is to
be measured spin-up in the z-direction, hence
$\langle{\uparrow}\ {*}\,|S = 0, M_S = 0\rangle$, and particle 2 is to be
measured spin-down in direction $\mathbf{n}$, so finally we have
$\langle{\uparrow}\,{\downarrow_n}|S = 0, M_S = 0\rangle$.


Ex 6.4 With both detectors aligned in the same
direction, we have $\theta = 0$ and so Equation 6.17 gives

$$\text{probability (up, down)} = \tfrac{1}{2}\cos^2(0) = \tfrac{1}{2}.$$

Similarly, we find that

$$\text{probability (down, up)} = \tfrac{1}{2}\cos^2(0) = \tfrac{1}{2}.$$

So the total probability of finding the particles with
opposite spin components in a given direction is
$\tfrac{1}{2} + \tfrac{1}{2} = 1$.

Ex6.5 Since $\theta_1 - \theta_2 = \pi/2$, we know that
$\tfrac{1}{2}\cos^2[(\theta_1 - \theta_2)/2] = \tfrac{1}{2}\sin^2[(\theta_1 - \theta_2)/2] = 1/4$, so
the prediction is that there will be equal numbers in
each of the four categories.


Comment: Alternatively, the answer follows by 
applying the fact that a singlet state has the symmetry 
property described in Section 6.2.3. Because a singlet 
state appears the same from all angles, the result 
follows since any one of the four cases is the same as 
any other case from a different perspective, so the 
numbers must be equal, i.e. 1/4. 


Ex6.6 The quantum prediction for the correlation function
is $C(\theta_1 - \theta_2) = -\cos(\theta_1 - \theta_2)$. Hence the
prediction is that, for these particular angles,
$\Sigma = -3\cos 45^\circ + \cos 135^\circ = -2\sqrt{2}$.
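The numerical value in Ex6.6 is easy to confirm. The sketch below assumes only the correlation function C = -cos(angle difference) and the combination of three terms at 45° and one term at 135° that appears in the solution; the names `correlation` and `sigma` are illustrative.

```python
import math


def correlation(angle_deg):
    """Quantum prediction C = -cos(angle difference), with the angle in degrees."""
    return -math.cos(math.radians(angle_deg))


# Three terms at 45 degrees enter with a plus sign, one term at 135 degrees
# with a minus sign, as in the expression -3 cos 45 + cos 135.
sigma = 3 * correlation(45.0) - correlation(135.0)
print(sigma, -2 * math.sqrt(2))
```

The magnitude 2√2 exceeds 2, which is the point of the exercise.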


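Ex 6.7 We must verify normalization (that
$\langle L|L\rangle = \langle R|R\rangle = 1$) and orthogonality (that
$\langle L|R\rangle = \langle R|L\rangle = 0$).

Remember that the bra corresponding to $|R\rangle$ has the
complex conjugate of all the coefficients in $|R\rangle$, leading
to

$$\begin{aligned}
\langle R|R\rangle &= \tfrac{1}{2}\bigl(\langle H| - i\,\langle V|\bigr)\bigl(|H\rangle + i\,|V\rangle\bigr)\\
&= \tfrac{1}{2}\bigl(\langle H|H\rangle + \langle V|V\rangle - i\,\langle V|H\rangle + i\,\langle H|V\rangle\bigr).
\end{aligned}$$

The first two bra-kets are unity, and the third and fourth
are zero, by the orthonormality of $|V\rangle$ and $|H\rangle$. Hence
$\langle R|R\rangle = 1$, and the same argument can be used to show
that $|L\rangle$ is normalized. We also have

$$\begin{aligned}
\langle R|L\rangle &= -\tfrac{1}{2}\bigl(\langle H| - i\,\langle V|\bigr)\bigl(|H\rangle - i\,|V\rangle\bigr)\\
&= -\tfrac{1}{2}\bigl(\langle H|H\rangle - \langle V|V\rangle - i\,\langle V|H\rangle - i\,\langle H|V\rangle\bigr)\\
&= 0,
\end{aligned}$$

so $|R\rangle$ and $|L\rangle$ are orthogonal. Note that since
$\langle a|b\rangle = \langle b|a\rangle^*$ for any $|a\rangle$ and $|b\rangle$, $\langle L|R\rangle = 0$ too.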


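Ex6.8 First, $|V_\theta V_\theta\rangle$ is just $|V_\theta\rangle_1|V_\theta\rangle_2$, etc. From
Equations 6.23 and 6.24, the left-hand side becomes

$$\frac{1}{\sqrt{2}}\Bigl[\bigl(\cos\theta\,|V\rangle_1 + \sin\theta\,|H\rangle_1\bigr)\bigl(\cos\theta\,|V\rangle_2 + \sin\theta\,|H\rangle_2\bigr)
+ \bigl(-\sin\theta\,|V\rangle_1 + \cos\theta\,|H\rangle_1\bigr)\bigl(-\sin\theta\,|V\rangle_2 + \cos\theta\,|H\rangle_2\bigr)\Bigr].$$

Multiplying out, the two $|V\rangle_1|H\rangle_2$ terms
cancel, as do the two $|H\rangle_1|V\rangle_2$ terms. Then,
using $\cos^2\theta + \sin^2\theta = 1$, this becomes

$$\frac{1}{\sqrt{2}}\bigl[|V\rangle_1|V\rangle_2 + |H\rangle_1|H\rangle_2\bigr] = \frac{1}{\sqrt{2}}\bigl[|VV\rangle + |HH\rangle\bigr].$$

Note that $|H\rangle_1|V\rangle_2$ is not the same as $|V\rangle_1|H\rangle_2$, since
the first ket is the state of photon 1 and the second that
of photon 2.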


Ex 6.9 (a) The ‘near’ measurement will collapse the 
entangled state vector Equation 6.28 onto the term 
|VV), so that the ‘far’ detector will receive a photon in 
the |V) state. Such a photon is certain to be stopped by 
a Polaroid sheet oriented in the x-direction. 


(b) The state will be collapsed by the ‘near’ 
measurement onto the non-entangled term with both 
photons in the |H) state. The second photon is therefore 
certain to pass through the ‘far’ Polaroid. 


(c) The photon approaching the ‘far’ Polaroid will
certainly be in the state $|V\rangle$. Since the second detector
is oriented with its vertical axis at $45^\circ = \pi/4$ radians to
the z-axis, Malus’s law tells us that a fraction equal to
$\cos^2(\pi/4) = 0.5$ will be detected on the far side of the
‘far’ Polaroid.


Ex 6.10 If the first particle were to have a positive
value of S, in a measurement, then the state vector 
would collapse onto the first term of Equation 6.31, for 
which S, is positive for both of the other two particles. 
Likewise, if the first particle were to have a negative 
value of S, in a measurement, the other two particles 
must also have negative S,. 


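Ex 7.1 In order to show that $|V_\theta\rangle$ and $|H_\theta\rangle$ are
orthogonal for any value of $\theta$, we need to calculate the
inner product $\langle V_\theta|H_\theta\rangle$:

$$\langle V_\theta|H_\theta\rangle = \begin{bmatrix}\cos\theta & \sin\theta\end{bmatrix}\begin{bmatrix}-\sin\theta\\ \cos\theta\end{bmatrix} = -\sin\theta\cos\theta + \sin\theta\cos\theta = 0.$$

Since this inner product is zero for any $\theta$, the two
eigenvectors are orthogonal. This could also be shown
by considering the inner product $\langle H_\theta|V_\theta\rangle$. (Remember
that $\langle a|b\rangle = \langle b|a\rangle^*$ always.)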

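Ex 7.2 To show that $|V_\theta\rangle$ and $|H_\theta\rangle$ are normalized, we
need to calculate $\langle V_\theta|V_\theta\rangle$ and $\langle H_\theta|H_\theta\rangle$:

$$\langle V_\theta|V_\theta\rangle = \begin{bmatrix}\cos\theta & \sin\theta\end{bmatrix}\begin{bmatrix}\cos\theta\\ \sin\theta\end{bmatrix} = \cos^2\theta + \sin^2\theta = 1,$$

$$\langle H_\theta|H_\theta\rangle = \begin{bmatrix}-\sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}-\sin\theta\\ \cos\theta\end{bmatrix} = \sin^2\theta + \cos^2\theta = 1.$$

So $|V_\theta\rangle$ and $|H_\theta\rangle$ are normalized to unity.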


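Ex7.3 We use Equations 7.2 and 7.3:

$$\begin{aligned}
\hat{P}(\theta)\,|V_\theta\rangle &= \begin{bmatrix}\cos 2\theta & \sin 2\theta\\ \sin 2\theta & -\cos 2\theta\end{bmatrix}\begin{bmatrix}\cos\theta\\ \sin\theta\end{bmatrix}\\
&= \begin{bmatrix}\cos 2\theta\cos\theta + \sin 2\theta\sin\theta\\ \sin 2\theta\cos\theta - \cos 2\theta\sin\theta\end{bmatrix}\\
&= \begin{bmatrix}\cos(2\theta - \theta)\\ \sin(2\theta - \theta)\end{bmatrix} = |V_\theta\rangle,
\end{aligned}$$

$$\begin{aligned}
\hat{P}(\theta)\,|H_\theta\rangle &= \begin{bmatrix}\cos 2\theta & \sin 2\theta\\ \sin 2\theta & -\cos 2\theta\end{bmatrix}\begin{bmatrix}-\sin\theta\\ \cos\theta\end{bmatrix}\\
&= \begin{bmatrix}-\cos 2\theta\sin\theta + \sin 2\theta\cos\theta\\ -\sin 2\theta\sin\theta - \cos 2\theta\cos\theta\end{bmatrix}\\
&= \begin{bmatrix}\sin(2\theta - \theta)\\ -\cos(2\theta - \theta)\end{bmatrix} = -|H_\theta\rangle.
\end{aligned}$$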
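The two eigenvalue relations of Ex7.3 can be spot-checked numerically. This sketch assumes the matrix with rows (cos 2θ, sin 2θ) and (sin 2θ, -cos 2θ) and the column vectors (cos θ, sin θ) and (-sin θ, cos θ) used in the solution; the function names `apply_p`, `v_theta` and `h_theta` are illustrative.

```python
import math


def apply_p(theta, vec):
    """Apply [[cos 2t, sin 2t], [sin 2t, -cos 2t]] to a two-component vector."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return (c * vec[0] + s * vec[1], s * vec[0] - c * vec[1])


def v_theta(theta):
    """Components of |V_theta> in the (V, H) basis."""
    return (math.cos(theta), math.sin(theta))


def h_theta(theta):
    """Components of |H_theta> in the (V, H) basis."""
    return (-math.sin(theta), math.cos(theta))


theta = 0.7  # any angle in radians
print(apply_p(theta, v_theta(theta)), v_theta(theta))  # equal: eigenvalue +1
print(apply_p(theta, h_theta(theta)), h_theta(theta))  # opposite: eigenvalue -1
```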




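Ex 7.4 A photon polarized along the z-axis is
represented by $|\Psi\rangle = |V\rangle = \begin{bmatrix}1\\ 0\end{bmatrix}$, and the probability of
a measurement yielding $|V_\theta\rangle$ is given by

$$P_+(\theta) = |\langle V_\theta|V\rangle|^2 = \left|\begin{bmatrix}\cos\theta & \sin\theta\end{bmatrix}\begin{bmatrix}1\\ 0\end{bmatrix}\right|^2 = \cos^2\theta.$$

This result for single photons is the
quantum-mechanical explanation of Malus’s law.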


Ex 7.5

Alice’s basis      | H/V    | H/V    | H/V    | H/V
Alice’s sent bit   | 1      | 0      | 1      | 0
Bob’s basis        | H/V    | H/V    | D      | D
Bob’s detected bit | 1      | 0      | 1 or 0 | 1 or 0

Alice’s basis      | D      | D      | D      | D
Alice’s sent bit   | 1      | 0      | 1      | 0
Bob’s basis        | H/V    | H/V    | D      | D
Bob’s detected bit | 1 or 0 | 1 or 0 | 1      | 0


In the first two and last two columns, Alice and Bob 
employ the same bases and so Bob necessarily finds the 
photon in the state sent by Alice. In the other four 
cases, the bases are different and Bob will find a 0 or 1 
with equal probability. 


Ex 7.6 Start by constructing a table similar to the one 
in the previous exercise, to cover all the different 
combinations of the common basis chosen by Alice and 
Bob, and the basis chosen by Eve, as shown below. 


Alice & Bob basis  | H/V    | H/V    | H/V    | H/V
Alice’s sent bit   | 1      | 0      | 1      | 0
Eve’s basis        | H/V    | H/V    | D      | D
Bob’s detected bit | 1      | 0      | 1 or 0 | 1 or 0

Alice & Bob basis  | D      | D      | D      | D
Alice’s sent bit   | 1      | 0      | 1      | 0
Eve’s basis        | H/V    | H/V    | D      | D
Bob’s detected bit | 1 or 0 | 1 or 0 | 1      | 0


Through examination of this table, you should be able 
to see that, on average, Eve will choose the wrong basis 
for her measurement 50% of the time. Of those times, 
on half of the occasions, Bob will measure the wrong 
bit value. So in Bob’s resulting string, 25% of the bits 
will differ from Alice’s. The answer can also be seen, 
more simply perhaps, by noting that half the time, 

Eve chooses the ‘wrong’ basis, and on half of those 
occasions she will transmit the ‘wrong’ value to Bob. 
Hence 25% of the bits received by Bob will differ from 
what Alice sent. 
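The 25% figure in Ex 7.6 can be illustrated with a small Monte Carlo sketch of the intercept-and-resend attack. It is a simplified model under stated assumptions: only bits where Alice and Bob share a basis are simulated, a measurement in the preparation basis returns the prepared bit, and a measurement in the other basis returns 0 or 1 with equal probability. The names `measure` and `error_rate` and the seed are illustrative.

```python
import random


def measure(bit, prep_basis, meas_basis, rng):
    """Return the prepared bit if the bases match, otherwise a random bit."""
    return bit if prep_basis == meas_basis else rng.randint(0, 1)


def error_rate(n_photons, seed=0):
    """Fraction of Bob's bits that differ from Alice's when Eve intercepts every photon."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(n_photons):
        basis = rng.choice(["HV", "D"])      # basis shared by Alice and Bob
        bit = rng.randint(0, 1)              # Alice's sent bit
        eve_basis = rng.choice(["HV", "D"])  # Eve guesses a basis
        eve_bit = measure(bit, basis, eve_basis, rng)
        # Eve resends a photon prepared in her own basis with her measured value.
        bob_bit = measure(eve_bit, eve_basis, basis, rng)
        errors += bob_bit != bit
    return errors / n_photons


print(error_rate(100000, seed=1))  # close to 0.25
```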


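Ex 7.7 Using the relationships in Equations 7.10,

$$|\Psi^-\rangle = \frac{1}{\sqrt{2}}\Bigl[\bigl(\cos\theta\,|V_\theta\rangle_A - \sin\theta\,|H_\theta\rangle_A\bigr)\bigl(\sin\theta\,|V_\theta\rangle_B + \cos\theta\,|H_\theta\rangle_B\bigr)
- \bigl(\sin\theta\,|V_\theta\rangle_A + \cos\theta\,|H_\theta\rangle_A\bigr)\bigl(\cos\theta\,|V_\theta\rangle_B - \sin\theta\,|H_\theta\rangle_B\bigr)\Bigr].$$

Multiplying out the terms in this equation, we obtain

$$\begin{aligned}
|\Psi^-\rangle = \frac{1}{\sqrt{2}}\Bigl[\bigl(&\cos\theta\sin\theta\,|V_\theta\rangle_A|V_\theta\rangle_B + \cos\theta\cos\theta\,|V_\theta\rangle_A|H_\theta\rangle_B\\
&- \sin\theta\sin\theta\,|H_\theta\rangle_A|V_\theta\rangle_B - \sin\theta\cos\theta\,|H_\theta\rangle_A|H_\theta\rangle_B\bigr)\\
- \bigl(&\sin\theta\cos\theta\,|V_\theta\rangle_A|V_\theta\rangle_B - \sin\theta\sin\theta\,|V_\theta\rangle_A|H_\theta\rangle_B\\
&+ \cos\theta\cos\theta\,|H_\theta\rangle_A|V_\theta\rangle_B - \cos\theta\sin\theta\,|H_\theta\rangle_A|H_\theta\rangle_B\bigr)\Bigr].
\end{aligned}$$

Collecting terms, the two $|V_\theta\rangle_A|V_\theta\rangle_B$ terms cancel,
and so do the two $|H_\theta\rangle_A|H_\theta\rangle_B$ terms. Using
$\sin^2\theta + \cos^2\theta = 1$ twice, once for the $|V_\theta\rangle_A|H_\theta\rangle_B$
term and once for the $|H_\theta\rangle_A|V_\theta\rangle_B$ term, we find that
this equation simplifies to Equation 7.15.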


Ex7.8 $P_{++}(\alpha, \beta)$ is the probability that, for a pair of
photons, Alice finds a vertically-polarized photon
when her analyzer is set at angle $\alpha$, and Bob finds a
vertically-polarized photon with his analyzer at angle $\beta$.

This probability is found by calculating the modulus
squared of the inner product of $|V_\alpha\rangle_A|V_\beta\rangle_B$ (in which
we temporarily restore the A and B labels before
reverting to the positional convention) with the Bell
state described by Equation 7.14:

$$P_{++}(\alpha, \beta) = |\langle V_\alpha V_\beta|\Psi^-\rangle|^2 = \left|\langle V_\alpha V_\beta|\,\frac{1}{\sqrt{2}}\bigl(|VH\rangle - |HV\rangle\bigr)\right|^2.$$

(Remember that on each side of the inner product,
the first term refers to particle A, the second to
particle B, so that the kets on the right could be written
$|V\rangle_A|H\rangle_B - |H\rangle_A|V\rangle_B$.) Recalling from Section 6.3.3
of Chapter 6 that we only take inner products for the
same particle, we rewrite this equation as

$$P_{++}(\alpha, \beta) = \tfrac{1}{2}\bigl|\langle V_\alpha|V\rangle_A\,\langle V_\beta|H\rangle_B - \langle V_\alpha|H\rangle_A\,\langle V_\beta|V\rangle_B\bigr|^2,$$

where we have restored the particle labels to the inner
products. Using the expressions in Equations 7.12 for
these inner products, we find

$$P_{++}(\alpha, \beta) = \tfrac{1}{2}\,|\cos\alpha\sin\beta - \sin\alpha\cos\beta|^2 = \tfrac{1}{2}\sin^2(\alpha - \beta).$$
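The final formula of Ex7.8 can be checked by evaluating the inner product term by term. The sketch assumes real analyzer states with components (cos, sin) in the (V, H) basis and the Bell state (|VH⟩ - |HV⟩)/√2 used in the solution; the function name `p_plus_plus` and the dictionary representation are illustrative.

```python
import math


def p_plus_plus(alpha, beta):
    """Probability that both analyzers, at angles alpha and beta, register vertical."""
    s = 1.0 / math.sqrt(2.0)
    # Non-zero components of the Bell state (|VH> - |HV>) / sqrt(2).
    psi_minus = {("V", "H"): s, ("H", "V"): -s}
    # Components of <V_alpha| for photon A and <V_beta| for photon B.
    bra_a = {"V": math.cos(alpha), "H": math.sin(alpha)}
    bra_b = {"V": math.cos(beta), "H": math.sin(beta)}
    amplitude = sum(bra_a[a] * bra_b[b] * c for (a, b), c in psi_minus.items())
    return amplitude ** 2


alpha, beta = 0.4, 1.1
print(p_plus_plus(alpha, beta), 0.5 * math.sin(alpha - beta) ** 2)
```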


Ex7.9 Substituting the values $\alpha_1 = 0^\circ$, $\alpha_2 = 45^\circ$,
$\beta_1 = 22.5^\circ$ and $\beta_2 = -22.5^\circ$ into Equation 7.18 gives
for the first term $C(\alpha_1, \beta_1) = -\cos(2(0^\circ - 22.5^\circ))$.
Doing the same for the other terms gives

$$\Sigma = -\cos(-45^\circ) - \cos(45^\circ) - \cos(45^\circ) + \cos(135^\circ) = -2\sqrt{2}.$$


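Ex7.10 Following the model just given, we have

$$P_{+-}(\alpha, \beta) = |\langle V_\alpha H_\beta|VH\rangle|^2 = |\langle V_\alpha|V\rangle_A\,\langle H_\beta|H\rangle_B|^2 = \cos^2\alpha\,\cos^2\beta,$$

$$P_{-+}(\alpha, \beta) = |\langle H_\alpha V_\beta|VH\rangle|^2 = |\langle H_\alpha|V\rangle_A\,\langle V_\beta|H\rangle_B|^2 = \sin^2\alpha\,\sin^2\beta.$$
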
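Ex 7.11 From Equations 7.2 with $\theta = \pi/4$, we have

$$|V_{\pi/4}\rangle = \frac{1}{\sqrt{2}}\bigl(|V\rangle + |H\rangle\bigr),$$

from which we see first that both amplitudes
are real so that $\phi = 0$, and secondly that
$\cos(\theta/2) = \sin(\theta/2) = 1/\sqrt{2}$ so that $\theta/2 = \pi/4$ and
hence $\theta = \pi/2$. Similarly,

$$|H_{\pi/4}\rangle = \frac{1}{\sqrt{2}}\bigl(-|V\rangle + |H\rangle\bigr).$$
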
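The pair of values $\theta = \pi/2$ and $\phi = \pi$ yields

$$\frac{1}{\sqrt{2}}\bigl(|V\rangle + e^{i\pi}\,|H\rangle\bigr) = \frac{e^{i\pi}}{\sqrt{2}}\bigl(-|V\rangle + |H\rangle\bigr),$$

which is the required form, noting that the overall factor
of $e^{i\pi}$ is irrelevant to the specification of the state vector.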


Ex 7.12 From the third and fourth rows of Table 7.2
we have in the first case $\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$, and in the
second $-\beta\,|V\rangle_3 + \alpha\,|H\rangle_3$.


Ex 7.13 The state vector after the transformation is 
found by multiplying the initial state vector by the 
transformation matrix: 


-i ofl =- 


But |6|? + |—y|? = |y]? + |6|? = 1, so the 
normalization has been preserved. 


Ex7.14 Using the appropriate matrix multiplication, 


Blea 


and the right-hand side represents the amplitudes 
defining |~),. 


Ex 7.15 The ‘before’ state is |H) |a). We see from 
Equation 7.34 that the subsequent state is, in two 
alternative forms, 


1 1 
ql (il +10) = 
Ex 7.16 The structure of Equation 7.40 tells us that if
(a) the two photons are detected in the same detector,
then the overall state vector $|\Psi_{\text{out}}\rangle$ will collapse onto
one of the first three terms, so that the joint polarization
state of photons 1 and 2 will be one of the three Bell
states $|\Phi^\pm\rangle_{12}$ or $|\Psi^+\rangle_{12}$ defined in Equations 7.31.

On the other hand, if (b) the photons are detected
in different detectors, then the overall state vector
$|\Psi_{\text{out}}\rangle$ will collapse onto the fourth term, and the joint
polarization state of photons 1 and 2 will be the Bell
state $|\Psi^-\rangle_{12}$.


(i IH) le) + IH) la) ). 


Ex7.17 Points would include the following. 


1. Why transfer of quantum state is non-trivial: the 
uncertainty principle, the no-cloning theorem, and 
information contained in a qubit. 

2. Input photon 1 to Alice.

3. Create entangled pair, photons 2 and 3.

4. Photon 2 to Alice, photon 3 to Bob.


5. Alice makes Bell state measurement on photons 1
and 2.


6. This measurement projects the joint state of 
photons 1 and 2, which can be expanded as a sum 
over the four Bell states, onto one of the Bell 
states. 




7. Each of the four terms in the overall state 
corresponds to a state of photon 3; this state could 
be transformed to the state of photon 1 by 
operating upon it by one of four transformations. 
But which one? 


8. Alice reports the results of Bell state measurement 
to Bob using classical means. 


9. Alice never learns the state of photon 1. 


10. Bob makes a transformation of the state of his 
photon depending on which of the four Bell states 
Alice measures, as communicated by Alice by 
classical means. 


11. The nature and limitations of simple Bell state 
measurement (include diagram like Figure 7.11), 
as in the implementation of Zeilinger and 
collaborators. 


12. With the scheme of Zeilinger and collaborators, 
there is successful teleportation only 25% of the 
time. 


13. What teleportation is and what it is not. (It is the 
transportation of states, not of particles; it does not 
permit cloning since the initial state of a particle is 
destroyed.) 


Ex8.1 (a) $\mathbf{r}_2 - \mathbf{r}_1$ is the displacement vector from
point 1 to point 2; (b) $|\mathbf{r}_2 - \mathbf{r}_1|$ is the distance between
points 1 and 2; and (c) $(\mathbf{r}_2 + \mathbf{r}_1)/2$ is the position vector
of the point that is midway between points 1 and 2, on
the line between these two points.


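Ex8.2 We have

$$\begin{aligned}
3\mathbf{a} + 2\mathbf{b} &= 3(\mathbf{e}_x + 3\mathbf{e}_y) + 2(5\mathbf{e}_x - 7\mathbf{e}_y)\\
&= (3 + 10)\,\mathbf{e}_x + (9 - 14)\,\mathbf{e}_y\\
&= 13\,\mathbf{e}_x - 5\,\mathbf{e}_y.
\end{aligned}$$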


Ex8.3 We have

$$\begin{aligned}
\mathbf{a}\cdot\mathbf{a} &= (0.6)^2 + (0.8)^2 = 1,\\
\mathbf{b}\cdot\mathbf{b} &= (-0.8)^2 + (0.6)^2 = 1,\\
\mathbf{a}\cdot\mathbf{b} &= (0.6) \times (-0.8) + (0.8) \times (0.6) = 0,
\end{aligned}$$

so $\mathbf{a}$ is normalized, $\mathbf{b}$ is normalized and $\mathbf{a}$ is orthogonal
to $\mathbf{b}$. The Cauchy–Schwarz inequality is easily satisfied
since $|0| \le 1 \times 1 = 1$.


Ex 8.4 From Equation 8.5 we have

$$ab\cos\theta = -ab.$$

Hence, $\cos\theta = -1$ and $\theta = \pi$; the vectors point in
opposite directions.


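Ex8.5 Using Equation 8.15,

$$\begin{aligned}
\mathbf{a} \times \mathbf{b} &= \begin{vmatrix}\mathbf{e}_x & \mathbf{e}_y & \mathbf{e}_z\\ 3 & 4 & 0\\ -4 & 3 & 0\end{vmatrix}\\
&= \begin{vmatrix}4 & 0\\ 3 & 0\end{vmatrix}\mathbf{e}_x - \begin{vmatrix}3 & 0\\ -4 & 0\end{vmatrix}\mathbf{e}_y + \begin{vmatrix}3 & 4\\ -4 & 3\end{vmatrix}\mathbf{e}_z\\
&= 0\,\mathbf{e}_x + 0\,\mathbf{e}_y + 25\,\mathbf{e}_z = 25\,\mathbf{e}_z.
\end{aligned}$$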
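The determinant expansion in Ex8.5 corresponds to the usual component formula for the vector product, sketched below in Python. The function name `cross` is illustrative; the vectors are the ones appearing in the determinant of the solution.

```python
def cross(a, b):
    """Return the vector product a x b for 3-component tuples."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)


a = (3, 4, 0)
b = (-4, 3, 0)
print(cross(a, b))  # (0, 0, 25)
```

Swapping the arguments reverses the sign, so `cross(b, a)` gives (0, 0, -25).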


Ex8.6 Point your right hand horizontally to the South
in such a way that its fingers can bend horizontally to
the West. Your outstretched right thumb then points
downwards; this is the direction of $\mathbf{a} \times \mathbf{b}$.


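Ex8.7 We have

$$\begin{aligned}
\langle f|g\rangle^* &= \left(\int_{-\infty}^{\infty} f^*(x)\,g(x)\,dx\right)^*\\
&= \int_{-\infty}^{\infty} f(x)\,g^*(x)\,dx\\
&= \int_{-\infty}^{\infty} g^*(x)\,f(x)\,dx = \langle g|f\rangle.
\end{aligned}$$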


Ex8.8 Since $t$ is real, $e_1^* = e_1$ and so we have

$$\langle e_1|e_2\rangle = \int_{-1}^{1}\sqrt{\tfrac{3}{2}}\,t \times \sqrt{\tfrac{5}{8}}\,(3t^2 - 1)\,dt = 0$$

because the integrand is an odd function, and the
integral is over a range centred on $t = 0$.


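Ex8.9 We have $\langle a| = \begin{bmatrix}1 & -i\end{bmatrix}$, so

$$\langle a|b\rangle = \begin{bmatrix}1 & -i\end{bmatrix}\begin{bmatrix}i\\ 1\end{bmatrix} = i - i = 0.$$
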
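Ex8.10 For $\theta = 90^\circ$, the rotation matrix is

$$R(90^\circ) = \begin{bmatrix}\cos 90^\circ & -\sin 90^\circ\\ \sin 90^\circ & \cos 90^\circ\end{bmatrix} = \begin{bmatrix}0 & -1\\ 1 & 0\end{bmatrix},$$

so $\begin{bmatrix}1\\ 0\end{bmatrix}$ is rotated to $\begin{bmatrix}0 & -1\\ 1 & 0\end{bmatrix}\begin{bmatrix}1\\ 0\end{bmatrix} = \begin{bmatrix}0\\ 1\end{bmatrix}$.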


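Ex 8.11 Two successive rotations through $\theta$ have the
same effect as a single rotation through $2\theta$, so we must
have

$$\begin{aligned}
\begin{bmatrix}\cos(2\theta) & -\sin(2\theta)\\ \sin(2\theta) & \cos(2\theta)\end{bmatrix}
&= \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\\
&= \begin{bmatrix}\cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta\\ 2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta\end{bmatrix}.
\end{aligned}$$

Hence, equating corresponding matrix elements,

$$\cos(2\theta) = \cos^2\theta - \sin^2\theta \quad\text{and}\quad \sin(2\theta) = 2\sin\theta\cos\theta.$$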


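Ex 8.12 The characteristic equation is

$$\begin{vmatrix}-\lambda & -i\\ i & -\lambda\end{vmatrix} = 0,$$

so $\lambda^2 - (-i)(i) = 0$, giving $\lambda = \pm 1$.

For $\lambda = +1$, the eigenvalue equation gives

$$\begin{bmatrix}-1 & -i\\ i & -1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} = \begin{bmatrix}0\\ 0\end{bmatrix},$$

so $x_1 = -i x_2$ and a suitable eigenvector is $\frac{1}{\sqrt{2}}\begin{bmatrix}1\\ i\end{bmatrix}$.

For $\lambda = -1$, the eigenvalue equation gives

$$\begin{bmatrix}1 & -i\\ i & 1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} = \begin{bmatrix}0\\ 0\end{bmatrix},$$

so $x_1 = i x_2$ and a suitable eigenvector is $\frac{1}{\sqrt{2}}\begin{bmatrix}1\\ -i\end{bmatrix}$.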
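The eigenvectors in Ex 8.12 can be verified by direct multiplication. The sketch assumes the matrix with rows (0, -i) and (i, 0), which is the one consistent with the characteristic equation λ² - (-i)(i) = 0, and the eigenvectors (1, i)/√2 and (1, -i)/√2; the names `MATRIX`, `matvec` and `is_eigenvector` are illustrative.

```python
import math

MATRIX = ((0, -1j), (1j, 0))  # assumed from the characteristic equation


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-component column vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def is_eigenvector(m, v, eigenvalue, tol=1e-12):
    """Check that m v equals eigenvalue times v, component by component."""
    mv = matvec(m, v)
    return all(abs(mv[k] - eigenvalue * v[k]) < tol for k in range(2))


s = 1 / math.sqrt(2)
print(is_eigenvector(MATRIX, (s, 1j * s), +1))   # True
print(is_eigenvector(MATRIX, (s, -1j * s), -1))  # True
```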


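Ex 8.13 The characteristic equation is

$$\begin{vmatrix}1 - \lambda & 0\\ 1 & 1 - \lambda\end{vmatrix} = (1 - \lambda)^2 = 0,$$

so the only eigenvalue is $\lambda = 1$. The corresponding
eigenvector has components $x_1$ and $x_2$ satisfying

$$\begin{bmatrix}1 & 0\\ 1 & 1\end{bmatrix}\begin{bmatrix}x_1\\ x_2\end{bmatrix} = \begin{bmatrix}x_1\\ x_2\end{bmatrix},$$

which gives $x_1 = 0$. So the corresponding eigenvector
is $a\begin{bmatrix}0\\ 1\end{bmatrix}$, where $a$ can be chosen to be equal to 1 for
normalization.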
Complex numbers

$z = x + iy = re^{i\theta}$
$|z|^2 = zz^* = x^2 + y^2 = r^2$
$\operatorname{Re}(z) = (z + z^*)/2$
$z^n = r^n e^{in\theta}$
$e^{i\theta} = \cos\theta + i\sin\theta$
$\sin\theta = (e^{i\theta} - e^{-i\theta})/2i$
$e^{2\pi i} = 1$
$e^{-i\pi/2} = -i$


Elementary functions

$\ln a + \ln b = \ln(ab)$ $(a > 0,\ b > 0)$
$e^{\ln a} = \ln(e^a) = a$
$\cosh x = (e^x + e^{-x})/2$
$\sinh x = (e^x - e^{-x})/2$

$\sin(\theta + \pi) = -\sin\theta$
$\sin(\theta + \pi/2) = \cos\theta$
$\sin(\theta - \pi/2) = -\cos\theta$
$\cos(\theta - \pi/2) = \sin\theta$
$\tan(\theta + \pi) = \tan\theta$
$\tan(\theta + \pi/2) = -\cot\theta$
$\tan(\theta - \pi/2) = -\cot\theta$

$\sin(2\theta) = 2\sin\theta\cos\theta$
$\cos(2\theta) = \cos^2\theta - \sin^2\theta$
$\tan(2\theta) = 2\tan\theta/(1 - \tan^2\theta)$
$\cos^2 A + \sin^2 A = 1$

$\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B$
$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$
$\sin A\sin B = \tfrac{1}{2}\bigl(\cos(A - B) - \cos(A + B)\bigr)$
$\cos A\cos B = \tfrac{1}{2}\bigl(\cos(A - B) + \cos(A + B)\bigr)$
$\sin A\cos B = \tfrac{1}{2}\bigl(\sin(A - B) + \sin(A + B)\bigr)$


Physical constants

Planck’s constant          | $h$                  | $6.63 \times 10^{-34}$ J s
Planck’s constant$/2\pi$   | $\hbar$              | $1.06 \times 10^{-34}$ J s
vacuum speed of light      | $c$                  | $3.00 \times 10^{8}$ m s$^{-1}$
Coulomb law constant       | $1/4\pi\varepsilon_0$ | $8.99 \times 10^{9}$ m F$^{-1}$
permittivity of free space | $\varepsilon_0$      | $8.85 \times 10^{-12}$ F m$^{-1}$
permeability of free space | $\mu_0$              | $4\pi \times 10^{-7}$ H m$^{-1}$
Boltzmann’s constant       | $k$                  | $1.38 \times 10^{-23}$ J K$^{-1}$
Avogadro’s constant        | $N_{\text{m}}$       | $6.02 \times 10^{23}$ mol$^{-1}$
electron charge            | $-e$                 | $-1.60 \times 10^{-19}$ C
proton charge              | $e$                  | $1.60 \times 10^{-19}$ C
electron mass              | $m_{\text{e}}$       | $9.11 \times 10^{-31}$ kg
proton mass                | $m_{\text{p}}$       | $1.67 \times 10^{-27}$ kg
Bohr radius                | $a_0$                | $5.29 \times 10^{-11}$ m
atomic mass unit           | u                    | $1.66 \times 10^{-27}$ kg
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inner product (fig) = f* (x)g(a) dx (flg) = (lf) 


(Eaa) = Y alla) (Sasi HD alalt) 


a 


Hermitian operator A, 


Cauchy—-Schwarz inequality (flAg ) = (Aflg ) (A Kaaya Kriol 

commutation relations [A, B] = AB-BA (3, Pa] an. 

generalized Ehrenfest theorem, 

generalized uncertainty principle M et ({A Ai] ) AAAB>? K B] )| 
dt ih : ~ 2 , 


Orbital angular momentum 


: oe i? 
classical quantities Lz = Xpy — YPr L= lw Bot = Ti p=IA=Y7L 
a ð ð ð a 

operators L, = —iħ E By Zl L; uT L = L,+ L, +L, 

commutation relations [Les La] = iħÎ,; [igs Lal = iL, [L Ee = ihly E’, L] =0 
eigenvalues P=ii+1" L,=mh 1=0,1,2,..., m=0,+1,... 

e ee | 

Spin angular momentum (spin-;) 

general spin matrix, s h | cos? e™®sinð cos(6/2) —e7!? sin(0/2) 
general eigenvectors Sn == | tar =I a= 

2 Jeb sing —cosé e'? sin(0/2) cos(0/2) 

. . h|O a h0 —i ~ hll 0 
spin matrices =x S= S=- 

2 2ji 0 2}0 —1 
commutation relations [Sa Sy] = ins, [Sa] = iħS, [BaBa] = inS, [5", S.] =0 
eigenvalues S? = s(s +1)? S, = msħ s=}, Me=+5 
energy levels Emag = p> B HU = YS fi = -y BSn 

Identical particles 
singlet spin ket zl TL) —|11)) = 10,0) 
triplet spin kets | TT)=[1,1) (114) +111))=10  [41)=11,-1) 
spin total wave function exclusion principle composite particle 
fermion s= 5 3, antisymmetric yes odd number of fermions 
boson s=0,1,2,... symmetric no even number of fermions 


Definite integrals for positive integers n and m 


[ f(x)dx =0 (f(x) an odd function) 
f sin(nx) sin(ma) dx = - fnm 
0 2 


(n+ m even) 


T/2 T 
f sin(nz) sin(mx) dz = = ônm 
—1/2 2 


cos“ z dz = a 


nt 2-52 
nT 
f £ cos? zdz = 
0 


nT 3-3 
n°? 1 nT 
x’ cos? rdg = —— + — 
0 6 


© 


4 


nr/2 nr 
f cos? x dz = a5 


—nr/2 


nm [2 3,3 
2 2 n n n 
oe p a 
Jan cos «dx z4 7 (—1) 


nm/2 22 
f x? cos g dg = (—1)("*9)/? g — 1), (n odd) 


—nr/2 


Fora>0, n>0, m= 1: 
oe 2 /n2 

/ e77 /“ dr = ayr 
=00 


ca 2/2 n! 
| gentle e/a dg = z (n > 0) 


0 


[melas _1x3x = (2n — 1) ath lg 
E for (n > 1) 


a f(z) dz = 2 a f(x)dx (f(x) an even function) 

—a 0 

[ cos(nx) cos(mx) dz = i Oi 
0 2 


a /2 T 
f cos(nx) cos(ma) dx = = dnm 


5 (n+ m even) 
—1/2 


nT 
i nT 
sin? z dz = — 
0 2 


nt Dead 
: n T 
f xsin? zdz = 
0 


an g nn nr 
x sinf zdz = —— — — 
0 6 


4 


—nn/2 
nn /2 3,3 
f r? sin? zde = ~~ — all 
—n7/2 24 4 
nn /2 
z? cosa da = (—1)"/? 2nn, (n even) 
—nn/2 
OO 
f xe "de =n! 
0 
os A is 2 
f e77 ekt dg = y/re™/^ (k real) 
00 


mi=1x2x---xn Ol=1 


